Abstract. We consider a two-color Polya urn with balance S. Assume it is a large urn, i.e. the second eigenvalue m of the replacement matrix satisfies 1/2 < m/S < 1. After n drawings, the composition vector has asymptotically a first deterministic term of order n and a second random term of order $n^{m/S}$. The object of interest is the limit distribution of this random term.

The method consists in embedding the discrete time urn in continuous time, getting a two-type branching process. The dislocation equations associated with this process lead to a system of two differential equations satisfied by the Fourier transforms of the limit distributions. The resolution is carried out and it turns out that the Fourier transforms are explicitly related to Abelian integrals on the Fermat curve of degree m.
1 Introduction

Consider a two-color Polya-Eggenberger urn random process, with replacement matrix $R = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$: one starts with an urn containing red and black balls, the initial composition of the urn. At each discrete time n, one draws a ball uniformly at random from the urn, notes its color, puts it back into the urn and adds balls according to the following rule: if the drawn ball is red, one adds a red balls and b black balls; if the drawn ball is black, one adds c red balls and d black balls. The integers a, b, c, d are assumed to be nonnegative^3 and the urn is assumed to be balanced, which means that the total number of balls added at each step is a constant, equal to S = a + b = c + d. The composition vector of the urn at time n is denoted by

$$U^{DT}(n) = \begin{pmatrix} \#\text{ red balls at time } n \\ \#\text{ black balls at time } n \end{pmatrix}.$$

3 One admits classically negative values for a and d, together with arithmetical conditions on c and b. Nevertheless, the paper deals with so-called large urns, for which this never happens.



It is a random vector and the article deals with its asymptotics when n goes off 
to infinity. All along the paper, the index DT is used to refer to discrete time 
objects while CT will refer to continuous time ones. 

Since Polya's original paper [17], this question has been so extensively studied that citing all contributions has become hopeless. The following references give however a good idea of the variety of methods: combinatorics with many papers by H. Mahmoud (see his recent book [15]), probabilistic methods by means of embedding the process in continuous time (see Janson [9]), analytic combinatorics by Flajolet et al. ([7]) and a more algebraic approach in ([19]). Taken together, these papers are sufficiently well documented to guide the reader to a quasi-exhaustive bibliography.

The asymptotic behaviour of $U^{DT}(n)$ is closely related to the spectral decomposition of the replacement matrix. In the case of two colors, R is equivalent to $\begin{pmatrix} S & 0 \\ 0 & m \end{pmatrix}$, where the largest eigenvalue is the balance S and the smallest eigenvalue is the nonnegative integer m = a − c = d − b. We denote by σ the ratio between the two eigenvalues:

$$\sigma = \frac{m}{S} \leq 1.$$

It is well known that the asymptotics of the process has two different behaviours depending on the position of σ with respect to the value 1/2. Briefly said,

1. when σ < 1/2, the urn is called small and, except when R is triangular, the composition vector is asymptotically Gaussian^4:

$$\frac{U^{DT}(n) - n v_1}{\sqrt{n}} \xrightarrow[n \to \infty]{\mathcal{D}} \mathcal{N}(0, \Sigma^2)$$

where $v_1$ is a suitable eigenvector of ${}^tR$ relative to S and $\Sigma^2$ has a simple closed form;

2. when 1/2 < σ < 1, the urn is called large and the composition vector has a quite different strong asymptotic form:

$$U^{DT}(n) = n v_1 + n^{\sigma} W^{DT} v_2 + o(n^{\sigma}) \qquad (1)$$

where $v_1, v_2$ are suitable eigenvectors of ${}^tR$ relative to S and m, $W^{DT}$ is a real-valued random variable arising as the limit of a martingale, o( ) meaning a.s. and in any $L^p$, p ≥ 1 as well. The moments of $W^{DT}$ can be recursively calculated and have no known closed form ([19]).

4 The case σ = 1/2 is similar to this one, the normalisation being $\sqrt{n \log n}$ instead of $\sqrt{n}$.

In the present article, the object of interest is the distribution of $W^{DT}$ for large urns.

The first step consists in classically embedding the discrete time process $(U^{DT}(n))_{n \geq 0}$ into a continuous time Markov process $(U^{CT}(t))_{t \geq 0}$, by equipping each ball with an exponential clock. At the n-th jump time $\tau_n$, the continuous time process $U^{CT}(\tau_n)$ has the same distribution as $U^{DT}(n)$. This connection between both processes is the key point, allowing us to work on the continuous time one, where independence properties have been gained.



In Theorem 3.3, we show that, in the case of large urns, the continuous time process satisfies, when t tends to infinity, the following asymptotics:

$$U^{CT}(t) = e^{St}\,\xi\, v_1 (1 + o(1)) + e^{mt}\, W^{CT} v_2 (1 + o(1)), \qquad (2)$$

where $v_1$ and $v_2$ are the same vectors as above, ξ and $W^{CT}$ are real-valued random variables that arise as limits of martingales, o( ) meaning in any $L^p$, p ≥ 1. Moreover, we prove that ξ is Gamma-distributed. These results are based on the spectral decomposition of the infinitesimal generator of the continuous time process on spaces of two-variable polynomials.

Fortunately, relying on the embedding connection, the two random variables $W^{DT}$ and $W^{CT}$ are connected (Theorem 3.10):

$$W^{CT} = \xi^{\sigma}\, W^{DT} \quad \text{a.s.},$$

ξ and $W^{DT}$ being independent. Since $\xi^{\sigma}$ is invertible, the attention is focused on the determination of $W^{CT}$'s distribution.

Because of the nonnegativeness of R's entries, $(U^{CT}(t))_{t \geq 0}$ is a two-type branching process, concretized as a random tree: the branching property gives rise to dislocation equations on $U^{CT}(t)$. If one denotes by $\mathcal{F}$ (resp. $\mathcal{G}$) the characteristic function of $W^{CT}$ starting from one red ball and no black ball (resp. no red ball, one black ball), the independence of the subtrees in the branching process implies that the characteristic function of any $W^{CT}$ starting from α red balls and β black balls is the product $\mathcal{F}^{\alpha}\mathcal{G}^{\beta}$. Furthermore, the dislocation equations on $U^{CT}(t)$ lead to the following differential system

$$\begin{cases} \mathcal{F}(x) + m x \mathcal{F}'(x) = \mathcal{F}(x)^{a+1}\,\mathcal{G}(x)^{b} \\ \mathcal{G}(x) + m x \mathcal{G}'(x) = \mathcal{F}(x)^{c}\,\mathcal{G}(x)^{d+1} \end{cases} \qquad (3)$$
with some boundary conditions. Notice that the corresponding exponential moment generating series (Laplace series) are also solutions of (3), but their radius of convergence is equal to 0. This is detailed in Section 8.2.

The resolution of System (3) is achieved in Section 6, where it is shown that $\mathcal{F}$ and $\mathcal{G}$ are explicit in terms of inverse functions of Abelian integrals on the Fermat curve of degree m: for any complex z in a suitable open subset of $\mathbb{C}$, let

$$I_{m,S,b}(z) = \int_{[z, z\infty)} (1 + u^m)^{b/m}\, \frac{du}{u^{S+1}}$$

where $[z, z\infty)$ denotes the ray $\{tz,\ t \geq 1\}$. The function $I_{m,S,b}$ defines a conformal mapping on the open sector $V_m = \{z \neq 0,\ 0 < \arg(z) < \pi/m\}$. If $J_{m,S,b}$ denotes the holomorphic function, defined on the lower half-plane as left-reciprocal application of $I_{m,S,b}$ and extended to the whole standard slit plane by conjugacy, the closed forms of $\mathcal{F}$ and $\mathcal{G}$ are given in the following result.

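Theorem For any x > 0,

$$\mathcal{F}(x) = K x^{-\frac{1}{m}}\, J_{m,S,b}\!\left(C + \frac{K^S}{S}\, x^{-\frac{S}{m}}\right)$$

$$\mathcal{G}(x) = K x^{-\frac{1}{m}}\, J_{m,S,c}\!\left(C + \frac{K^S}{S}\, x^{-\frac{S}{m}}\right)$$

where $K \in \mathbb{C}$ and C < 0 are explicit constants.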



For precise statements and proofs, see Section 6.3 and Theorem 6.22.

The resolution of System (3) is made by a ramified change of variable and functions, leading to the following monomial system:

$$\begin{cases} f' = f^{a+1} g^{b} \\ g' = f^{c} g^{d+1} \end{cases} \qquad (4)$$

This remarkable fact has to be related to the study of the composition's law at finite time of small discrete urns by means of generating functions of all possible histories, as it is beautifully handled by Flajolet et al. ([7]). Their method leads directly to the same System (4) on generating functions. The assumption σ < 1/2, transposed on the four parameters a, b, c and d, does not fundamentally change the system's treatment but requires completely different analytic considerations on its solutions.

The limit laws of the $W^{CT}$ appear as a new family of probability distributions, indexed by three parameters S, m, b subject to assumptions (11) and by initial conditions α, β. We prove below that they admit densities that can be expressed by means of the inverse Fourier transform of their characteristic function's derivative. Furthermore, they are infinitely divisible and their support is the whole real line, the radius of convergence of their exponential moment generating series being equal to 0.

Many questions remain open. For instance, are these distributions characterized by their moments? What is the precise asymptotics of their densities at infinity (tails)? It is shown in [20] that, for triangular and nondiagonal replacement matrices, the discrete time limit law $W^{DT}$ is never infinitely divisible; does this remain true in the present nontriangular case?

Notation: in the whole text, N is the set of nonnegative integers. 



2 The model 



2.1 Definition of the process 



Let a, b, c and d be nonnegative integers such that a + b = c + d =: S and R be the matrix

$$R = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & S - a \\ S - d & d \end{pmatrix}. \qquad (5)$$



The discrete time Polya-Eggenberger urn process associated with the replacement matrix R, that has been heuristically described in the introduction, is the $\mathbb{N}^2 \setminus \{0\}$-valued Markov sequence $(U^{DT}(n), n \in \mathbb{N})$ whose transition probability at any nonzero point $(x, y) \in \mathbb{N}^2$ is

$$\frac{x}{x+y}\,\delta_{(x+a,\,y+b)} + \frac{y}{x+y}\,\delta_{(x+c,\,y+d)}, \qquad (6)$$



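where $\delta_v$ denotes Dirac point mass at $v \in \mathbb{N}^2$. This means that $(U^{DT}(n), n \in \mathbb{N})$ is a random walk in $\mathbb{N}^2 \setminus \{0\}$ (or in the two-dimensional one-column nonzero matrices with nonnegative integer entries, we will use both notations) recursively defined by the conditional probabilities

$$P\left(U^{DT}(n+1) = U^{DT}(n) + \begin{pmatrix} a \\ b \end{pmatrix} \,\Big|\, U^{DT}(n) = \begin{pmatrix} x \\ y \end{pmatrix}\right) = \frac{x}{x+y},$$

$$P\left(U^{DT}(n+1) = U^{DT}(n) + \begin{pmatrix} c \\ d \end{pmatrix} \,\Big|\, U^{DT}(n) = \begin{pmatrix} x \\ y \end{pmatrix}\right) = \frac{y}{x+y}.$$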
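The transition rule (6) can be illustrated by a minimal simulation sketch. The function name `simulate_urn`, its arguments and the choice of a seeded pseudo-random generator are assumptions made for illustration only; they are not part of the original text.

```python
import random

def simulate_urn(a, b, c, d, alpha, beta, n_steps, seed=0):
    """Simulate n_steps drawings of the balanced urn with replacement
    matrix [[a, b], [c, d]], started from alpha red and beta black balls.
    Returns the composition (red, black) after n_steps drawings."""
    assert a + b == c + d, "the urn must be balanced"
    rng = random.Random(seed)  # hypothetical choice: fixed seed for reproducibility
    x, y = alpha, beta
    for _ in range(n_steps):
        # A red ball is drawn with probability x / (x + y), see (6).
        if rng.random() * (x + y) < x:
            x, y = x + a, y + b
        else:
            x, y = x + c, y + d
    return x, y

# Example with the matrix R = [[6, 1], [2, 5]] (balance S = 7).
red, black = simulate_urn(6, 1, 2, 5, alpha=1, beta=1, n_steps=1000)
```

Whatever the random choices, the total number of balls after n drawings is deterministic because the urn is balanced; only the split between red and black balls is random.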
In the whole text,

$$\left(U^{DT}_{(\alpha,\beta)}(n),\ n \geq 0\right)$$

will denote the process starting from the nonzero vector (α, β) and

$$u := \alpha + \beta$$

will denote the total number of balls at time 0. Notice that the balance property S = a + b = c + d implies that the total number of balls at time n, when $U^{DT}(n) = (x, y)$, is the (nonrandom) number x + y = u + nS.

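If one denotes by $w_1 = \begin{pmatrix} a \\ b \end{pmatrix}$ and $w_2 = \begin{pmatrix} c \\ d \end{pmatrix}$ the increment vectors of the walk and by Φ the transition operator defined, for any function f on $\mathbb{N}^2$ and for any $v = \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{N}^2$, by

$$\Phi(f)(v) = x\left[f(v + w_1) - f(v)\right] + y\left[f(v + w_2) - f(v)\right],$$

then

$$E^{\mathcal{F}_n}\left(f(U^{DT}(n+1))\right) = \left(I + \frac{\Phi}{u + nS}\right)(f)\left(U^{DT}(n)\right)$$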



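where $(\mathcal{F}_n, n \geq 0)$ is the filtration associated with the process $(U^{DT}(n), n \geq 0)$. In particular,

$$E\left(U^{DT}(n+1) \mid U^{DT}(n)\right) = \left(I + \frac{A}{u + nS}\right) U^{DT}(n) \qquad (7)$$

where I denotes the two-dimensional identity matrix and

$$A := {}^tR = \begin{pmatrix} a & c \\ b & d \end{pmatrix}.$$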


2.2 Asymptotics of the discrete time process $U^{DT}(n)$

As mentioned in Section 1 and briefly recalled hereunder, a discrete time Polya-Eggenberger urn process has two different kinds of asymptotics depending on the ratio of the eigenvalues of its replacement matrix R. With our notations, these eigenvalues are S and

m := a − c = d − b.
Let us denote by $u_1$ and $u_2$ the two following linear eigenforms of A respectively associated with the eigenvalues S and m (i.e. $u_1 \circ A = S u_1$ and $u_2 \circ A = m u_2$):

$$u_1(x, y) = \frac{1}{S}(x + y), \qquad u_2(x, y) = \frac{1}{S}(bx - cy) \qquad (8)$$

and denote by $(v_1, v_2)$ the dual basis of $(u_1, u_2)$:

$$v_1 = \frac{S}{b + c}\begin{pmatrix} c \\ b \end{pmatrix}, \qquad v_2 = \frac{S}{b + c}\begin{pmatrix} 1 \\ -1 \end{pmatrix}. \qquad (9)$$

The vectors $v_k$ are eigenvectors of A and the projections on the eigenlines are $u_1 v_1$ and $u_2 v_2$.

For any positive real x and any nonnegative integer n, if one denotes by $\gamma_{x,n}$ the polynomial

$$\gamma_{x,n}(t) := \prod_{k=0}^{n-1}\left(1 + \frac{t}{x + k}\right),$$

the matrix $\gamma_{u/S,n}(A/S)$ is nonsingular and it is immediate from (7) that $\gamma_{u/S,n}(A/S)^{-1}\, U^{DT}(n)$ is a (vector-valued) martingale.

We denote the ratio of R's eigenvalues by

$$\sigma := \frac{m}{S}.$$

The case of small urn processes (i.e. when σ < 1/2) has been well studied; in this case, when R is not triangular, the random vector admits a Gaussian central limit theorem (see Janson [9]). Triangular replacement matrices require a particular treatment and lead most often to a nonnormal second-order limit (see Janson [10] or [20]).

Our subject of interest is the case of large urns, i.e. when σ > 1/2. In this case, $\frac{1}{S}U^{DT}(n)$ is a large Polya process with replacement matrix $\frac{1}{S}R$ in the sense of [19]. As a consequence, the projections of the above vector-valued martingale on the eigenlines of A, which are of course also martingales, converge in any $L^p$, p ≥ 1 (and a.s.). In particular (second projection),

$$M^{DT}(n) := \frac{1}{\gamma_{u/S,n}(\sigma)}\, u_2\left(U^{DT}(n)\right)$$

is a convergent martingale; since $\gamma_{u/S,n}(\sigma) = n^{\sigma}\,\frac{\Gamma(u/S)}{\Gamma(u/S + \sigma)}\,(1 + o(1))$, denoting by

$$W^{DT} := \lim_{n \to +\infty} \frac{1}{n^{\sigma}}\, u_2\left(U^{DT}(n)\right), \qquad (10)$$
a slight adaptation of [19] leads to the following theorem. Note that this theorem was essentially proven by Athreya and Karlin ([1]) and Janson ([9]) for random replacement matrices. The convergence in $L^p$-spaces when R is nonrandom is shown by the indicated adaptation of [19].

Theorem 2.1 Suppose that σ ∈ ]1/2, 1[. Then, as n tends to infinity,

$$U^{DT}(n) = n v_1 + n^{\sigma} W^{DT} v_2 + o(n^{\sigma})$$

where $v_1$ and $v_2$, defined in (9), are eigenvectors of A associated with the eigenvalues S and m; $W^{DT}$ is defined by (10); o( ) means a.s. and in any $L^p$, p ≥ 1.
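The martingale normalisation behind Theorem 2.1 can be checked exactly on small cases. The sketch below assumes the product form of the polynomial $\gamma_{x,n}$ with factors $1 + t/(x+k)$ for k = 0, ..., n−1; the function names `expected_u2` and `gamma_poly` are illustrative, not taken from the text. It computes $E\,u_2(U^{DT}(n))$ by propagating the exact law of the number of red draws, and compares it with $\gamma_{u/S,n}(\sigma)\,u_2(U^{DT}(0))$, which is what (7) predicts.

```python
from fractions import Fraction

def expected_u2(a, b, c, d, alpha, beta, n):
    """Exact E[u_2(U^DT(n))] with u_2(x, y) = (b*x - c*y) / S.
    After `step` drawings with k red draws, the composition is
    (alpha + k*a + (step-k)*c, beta + k*b + (step-k)*d)."""
    S = a + b
    dist = {0: Fraction(1)}  # dist[k] = P(k red draws so far)
    for step in range(n):
        new = {}
        for k, p in dist.items():
            x = alpha + k * a + (step - k) * c
            y = beta + k * b + (step - k) * d
            total = x + y
            if x:
                new[k + 1] = new.get(k + 1, 0) + p * Fraction(x, total)
            if y:
                new[k] = new.get(k, 0) + p * Fraction(y, total)
        dist = new
    return sum(
        p * Fraction(b * (alpha + k * a + (n - k) * c)
                     - c * (beta + k * b + (n - k) * d), S)
        for k, p in dist.items()
    )

def gamma_poly(x, n, t):
    """gamma_{x,n}(t) = prod_{k=0}^{n-1} (1 + t / (x + k)), exact arithmetic."""
    prod = Fraction(1)
    for k in range(n):
        prod *= 1 + t / (x + k)
    return prod

# Example: R = [[6, 1], [2, 5]], so S = 7, m = 4, sigma = 4/7, u = 2.
lhs = expected_u2(6, 1, 2, 5, alpha=1, beta=1, n=20)
rhs = gamma_poly(Fraction(2, 7), 20, Fraction(4, 7)) * Fraction(1 * 1 - 2 * 1, 7)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.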

2.3 Parametrization and hypotheses 



The subject of the paper is the distribution of $W^{DT}$ in Theorem 2.1, so that the Polya urn process will be assumed to be large. Furthermore, the replacement matrix R will be assumed not to be triangular, because the triangular case has to be treated separately with regard to its limit law, as attested by Janson [10], Flajolet et al. [8], [20] and the present paper.

In these conditions, the assumptions on the replacement matrix $R = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ are: a + b = c + d = S (balance condition), S/2 < m = a − c = d − b < S (large urn) and b, c ≥ 1 (not triangular). Because of the balance condition, the parametrization of Polya urns is governed by three free parameters. The computation of $W^{DT}$'s Fourier transform will show in Section 6.3 that a natural choice for these parameters consists in keeping the triple (m, S, b). The assumption "large and non triangular" is equivalent, in terms of these data, to the following:

$$R = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} S - b & b \\ S - m - b & m + b \end{pmatrix}$$

with

$$\begin{cases} m + 2 \leq S \leq 2m - 1 \\ 1 \leq b \leq S - m - 1. \end{cases} \qquad (11)$$



Note that these inequalities imply S ≥ 5 and m ≥ 3 and that, for a given m, the point (S, b) belongs to a triangle as represented in Figure 1.

For small values of S, large urn processes have the following possible replacement matrices: for S ∈ {1, 2}, only $R = S\,\mathrm{Id}_2$ defines a large urn; for S ∈ {3, 4}, all large urns have triangular matrices. For S ∈ {5, 6}, only $R = \begin{pmatrix} S-1 & 1 \\ 1 & S-1 \end{pmatrix}$ defines a non triangular large urn. For S = 7, apart from triangular or symmetric matrices, the only large urns have $\begin{pmatrix} 6 & 1 \\ 2 & 5 \end{pmatrix}$ as replacement matrix (or the other one deduced from this one by permutation of coordinates).

[Figure 1: Parameters (b, S) for a given m.]
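The list of small cases above can be recovered mechanically from conditions (11). The following sketch enumerates, for a given balance S, all pairs (m, b) allowed by (11) and builds the corresponding matrices; the function name `large_nontriangular_urns` is an illustrative assumption.

```python
def large_nontriangular_urns(S):
    """Replacement matrices ((a, b), (c, d)) with balance S that satisfy
    conditions (11): m + 2 <= S <= 2m - 1 and 1 <= b <= S - m - 1."""
    urns = []
    for m in range(1, S):
        if m + 2 <= S <= 2 * m - 1:
            for b in range(1, S - m):  # 1 <= b <= S - m - 1
                urns.append(((S - b, b), (S - m - b, m + b)))
    return urns

for S in range(3, 8):
    print(S, large_nontriangular_urns(S))
```

For S = 7 the enumeration returns one symmetric matrix (m = 5) and, for m = 4, the matrix with rows (6, 1), (2, 5) together with its image under permutation of coordinates, with rows (5, 2), (1, 6).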

3 Embedding in continuous time and martingale connection

3.1 Embedding

The idea of embedding discrete urn models in continuous time branching processes goes back at least to Athreya and Karlin [1]. A description is given in Athreya and Ney ([2], section 9). The method has been recently revisited and developed by Janson ([9]).

We define the continuous time Markov branching process $(U^{CT}(t), t \geq 0)$ as being the embedded process of $(U^{DT}(n), n \geq 0)$. Following for instance Bertoin ([3], section 1.1), this means that it is defined as the Markov chain in continuous time whose jump rate at any nonzero point $(x, y) \in \mathbb{N}^2$ is the finite (probability) measure given by the transition probability of the discrete time process (Formula (6)). One gets this way a branching process whose dynamical description in terms of red and black balls in an urn is the following. In the urn, at any moment, each ball is equipped with an Exp(1)-distributed^5 random clock, all the

5 For any positive real a, Exp(a) denotes the exponential distribution with parameter a.
clocks being independent. When the clock of a red ball rings, a red balls and b black balls are added in the urn; when the ringing clock belongs to a black ball, one adds c red balls and d black balls, so that the replacement rules are the same as in the discrete time urn process.

The successive jumping times of $(U^{CT}(t), t \geq 0)$ will be denoted by

$$0 = \tau_0 < \tau_1 < \cdots < \tau_n < \cdots$$

The nth jumping time is the time of the nth dislocation of the branching process. The process is thus constant on any interval $[\tau_n, \tau_{n+1}[$.
In the whole text,

$$\left(U^{CT}_{(\alpha,\beta)}(t),\ t \geq 0\right)$$

will denote the process starting from the nonzero vector (α, β). Thus, for any initial condition (α, β), for any t ≥ 0,

$$U^{CT}_{(\alpha,\beta)}(t) = U^{DT}_{(\alpha,\beta)}(a(t))$$

where

$$a(t) := \min\{n \geq 0,\ \tau_{n+1} > t\}.$$

Lemma 3.2 1) for n ≥ 0, the distribution of $\tau_{n+1} - \tau_n$ is Exp(u + Sn), where u denotes the total number of balls at time 0;

2) the processes $(\tau_n)_{n \geq 0}$ and $(U^{CT}(\tau_n))_{n \geq 0}$ are independent;

3) the processes $(U^{CT}(\tau_n))_{n \geq 0}$ and $(U^{DT}(n))_{n \geq 0}$ have the same distribution.

Proof. The total number of balls at time $t \in [\tau_n, \tau_{n+1}[$ is u + Sn. Therefore, 1) is a consequence of the fact that the minimum of k independent Exp(1)-distributed random variables is Exp(k)-distributed. 2) is the classical independence between the jump chain and the jump times in such Markov processes. The initial states and evolution rules of both Markov chains in discrete time and in continuous time are the same, so that 3) holds. ■



Convention: from now on, thanks to 3) of Lemma 3.2, we will classically consider that the discrete time process and the continuous time process are built on the same probability space, on which

$$\left(U^{CT}(\tau_n)\right)_{n \geq 0} = \left(U^{DT}(n)\right)_{n \geq 0} \quad \text{a.s.} \qquad (12)$$
3.2 Asymptotics of the continuous time process $U^{CT}(t)$

Let $v_1$ and $v_2$ be the linearly independent eigenvectors of A defined by (9). In the case of large urns, the asymptotics of the continuous time process $(U^{CT}(t))_{t \geq 0}$ is given in the following.

Theorem 3.3 (Asymptotics of continuous time process)

When t tends to infinity,

$$U^{CT}(t) = e^{St}\,\xi\, v_1 (1 + o(1)) + e^{mt}\, W^{CT} v_2 (1 + o(1)), \qquad (13)$$

where ξ and $W^{CT}$ are real-valued random variables, the convergence being almost sure and in any $L^p$-space, p ≥ 1. Furthermore, ξ is Gamma(u/S)-distributed, where u = α + β is the total number of balls at time 0.

Proof. The embedding in continuous time has been studied in Athreya and Karlin [1] and in Janson [9]. It has become classical that the process

$$\left(e^{-tA}\, U^{CT}(t)\right)_{t \geq 0}$$

is a vector-valued martingale and that, in the case of large urns (σ > 1/2), this martingale is bounded in $L^2$, thus converges. Its projections on the eigenlines $\mathbb{R}v_1$ and $\mathbb{R}v_2$, i.e. respectively

$$e^{-St}\, u_1\left(U^{CT}(t)\right) \quad \text{and} \quad e^{-mt}\, u_2\left(U^{CT}(t)\right),$$

are also $L^2$-convergent real-valued martingales, thus converge almost surely. Their respective limits are named ξ and $W^{CT}$. What still has to be proved is that these martingales converge in fact in any $L^p$, p ≥ 1. The identification of ξ's distribution will be a consequence of this proof.

The infinitesimal generator of the Markov process $(U^{CT}(t))_t$ is the finite-difference operator

$$\Phi(f)(x, y) = x\left\{f(x + a, y + b) - f(x, y)\right\} + y\left\{f(x + c, y + d) - f(x, y)\right\},$$

defined for any (measurable) function f and any $(x, y) \in \mathbb{R}^2$. For a very synthetic reference on semi-groups of Markov continuous-time processes, one can refer to Bertoin ([3], chapter 1). This operator Φ acts on two-variable polynomials. This action has been studied in detail in [19] in a more general frame. More precisely, for any integer d ≥ 1, the operator Φ acts on the finite-dimensional space of polynomials of degree less than d, so that, for any two-variable polynomial P and for any t ≥ 0,

$$E\left(P(U^{CT}(t))\right) = \exp(t\Phi)(P)\left(U^{CT}(0)\right) \qquad (14)$$
where, in this formula, Φ denotes the restriction of Φ itself to any finite-dimensional polynomial space containing P. The properties of Φ listed in the following lemma are proven in [12] and will be used here.

Lemma 3.4 There exists a unique family of polynomials $Q_{p,q} \in \mathbb{R}[x, y]$, p, q nonnegative integers, called reduced polynomials, such that

(1) $Q_{0,0} = 1$, $Q_{1,0} = u_1$ and $Q_{0,1} = u_2$ (see (8) for a definition of the eigenforms $u_1$ and $u_2$);

(2) $\Phi(Q_{p,q}) = (pS + qm)\,Q_{p,q}$ for all nonnegative integers p, q;

(3) $u_1^p u_2^q - Q_{p,q} \in \mathrm{Span}\{Q_{p',q'},\ p'S + q'm < pS + qm\}$ for all nonnegative integers p, q.

Note that the reduced polynomial $Q_{p,q}$ is in fact the projection of $u_1^p u_2^q$ on a suitable characteristic subspace of Φ's restriction to some finite-dimensional polynomial space, and that this spectral decomposition of Φ on polynomials has a particularly simple form (it is diagonalizable) because the urn is large and two-dimensional. See [19] for more details.



Formula (14) and Property (2) of Lemma 3.4 lead to

$$\forall (p, q) \in \mathbb{N}^2, \quad E\left(Q_{p,q}(U^{CT}(t))\right) = e^{(pS + qm)t}\, Q_{p,q}\left(U^{CT}(0)\right).$$

This implies straightforwardly, with (3) of Lemma 3.4, that, for any (p, q),

$$E\left(u_1^p u_2^q\,(U^{CT}(t))\right) = e^{(pS + qm)t}\, Q_{p,q}\left(U^{CT}(0)\right) + o\left(e^{(pS + qm)t}\right). \qquad (15)$$

In particular, the martingales $e^{-St}u_1(U^{CT}(t))$ and $e^{-mt}u_2(U^{CT}(t))$ are $L^p$-bounded for any p ≥ 1 and their respective limits, namely ξ and $W^{CT}$, satisfy, for any nonnegative integer p,

$$E\,\xi^p = Q_{p,0}\left(U^{CT}(0)\right) \quad \text{and} \quad E\,(W^{CT})^p = Q_{0,p}\left(U^{CT}(0)\right). \qquad (16)$$

The convergence part of the theorem follows now from the spectral decomposition of A: for any t ≥ 0,

$$U^{CT}(t) = u_1\left(U^{CT}(t)\right) v_1 + u_2\left(U^{CT}(t)\right) v_2.$$

Besides, it is proven in [19], or one can check it after an easy computation, that the reduced polynomials corresponding to the powers of $u_1$ are given by the closed formula

$$Q_{p,0} = u_1 (u_1 + 1)(u_1 + 2) \cdots (u_1 + p - 1).$$
Thanks to Formula (16), this shows that the p-th moment of ξ is, for any integer p ≥ 0,

$$E\,\xi^p = \frac{u}{S}\left(\frac{u}{S} + 1\right)\left(\frac{u}{S} + 2\right) \cdots \left(\frac{u}{S} + p - 1\right) = \frac{\Gamma\left(\frac{u}{S} + p\right)}{\Gamma\left(\frac{u}{S}\right)}$$

where u is the total number of balls at time 0 (remember that $u_1(U^{CT}(0)) = u/S$, see (8)). One identifies this way the required Gamma(u/S) distribution, characterized by its moments. ■

Remark 3.5 Notice that the distribution of ξ has been given by Janson ([9]), calculating first the distribution of $u_1(U^{CT}(t))$ for every t:

$$u_1\left(U^{CT}(t)\right) = \frac{u}{S} + Z(t)$$

where Z(t) has a negative binomial distribution.

Remark 3.6 Reduced polynomials $Q_{0,p}$ do not have a known closed form, so that reproducing the above method in order to compute the moments of $W^{CT}$ fails.

Remark 3.7 It follows from the proof that the real-valued random variables ξ and $W^{CT}$ are respective limits of the martingales

$$\xi = \lim_{t \to +\infty} e^{-St}\, u_1\left(U^{CT}(t)\right),$$

$$W^{CT} = \lim_{t \to +\infty} e^{-mt}\, u_2\left(U^{CT}(t)\right).$$

They are not independent and their joint moments are computed from Formula (15): for any nonnegative integers p, q,

$$E\left[\xi^p\,(W^{CT})^q\right] = Q_{p,q}\left(U^{CT}(0)\right).$$

For example, their respective means are $E\,\xi = u_1(U^{CT}(0)) = \frac{1}{S}(\alpha + \beta)$ and $E\,W^{CT} = u_2(U^{CT}(0)) = \frac{1}{S}(b\alpha - c\beta)$, whereas

$$E\left(\xi\, W^{CT}\right) = \frac{(\alpha + \beta + m)(b\alpha - c\beta)}{S^2}$$

as can be shown by computation of the reduced polynomial $Q_{1,1} = (u_1 + \sigma)\,u_2$ (one can directly check that this polynomial is an eigenvector of Φ, associated with the eigenvalue S + m).
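The eigenvector claims on reduced polynomials can be verified pointwise in exact arithmetic. The sketch below is an illustration under assumed parameter values (the matrix with rows (6, 1), (2, 5)); the helper names `Phi` and `is_eigen` are hypothetical. It applies the finite-difference operator Φ to $u_1$, $u_2$, $Q_{2,0} = u_1(u_1 + 1)$ and $Q_{1,1} = (u_1 + \sigma)u_2$ on a grid of integer points.

```python
from fractions import Fraction

# Assumed example parameters: R = [[6, 1], [2, 5]].
a, b, c, d = 6, 1, 2, 5
S, m = a + b, a - c
sigma = Fraction(m, S)

def u1(x, y):
    return Fraction(x + y, S)

def u2(x, y):
    return Fraction(b * x - c * y, S)

def Phi(f):
    """Finite-difference generator applied to a function of (x, y)."""
    return lambda x, y: (x * (f(x + a, y + b) - f(x, y))
                         + y * (f(x + c, y + d) - f(x, y)))

def Q20(x, y):
    return u1(x, y) * (u1(x, y) + 1)

def Q11(x, y):
    return (u1(x, y) + sigma) * u2(x, y)

def is_eigen(f, eigenvalue, points):
    """True if Phi(f) = eigenvalue * f at every point of `points`."""
    return all(Phi(f)(x, y) == eigenvalue * f(x, y) for x, y in points)

grid = [(x, y) for x in range(6) for y in range(6)]
checks = [is_eigen(u1, S, grid), is_eigen(u2, m, grid),
          is_eigen(Q20, 2 * S, grid), is_eigen(Q11, S + m, grid)]
```

A pointwise check on a grid is of course not a proof, but since all functions involved are polynomials of degree at most 2, agreement on a 6 × 6 grid does determine the polynomial identity. The uncorrected product $u_1 u_2$ fails the same test, which shows why the shift by σ is needed.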
Remark 3.8 When the urn is small (σ < 1/2), the same method shows that the result on the first projection is still valid: the martingale $(e^{-St}u_1(U^{CT}(t)))_t$ converges in any $L^p$ (p ≥ 1) to a Gamma(u/S)-distributed random variable. On the contrary, the martingale $(e^{-mt}u_2(U^{CT}(t)))_t$ diverges and it is shown in Janson that the second projection satisfies a central limit theorem: when σ = m/S < 1/2,

$$e^{-\frac{St}{2}}\, u_2\left(U^{CT}(t)\right) \xrightarrow[t \to \infty]{\mathcal{D}} \mathcal{N}$$

where $\mathcal{N}$ is a normal distribution. In the case σ = 1/2, the normalization must be modified and one gets the convergence in law of $t^{-1/2}\, e^{-St/2}\, u_2(U^{CT}(t))$ to a normal distribution.

Remark 3.9 The distributions of the $W^{CT}$ are infinitely divisible, because they are limits of infinitely divisible ones, obtained by scaling and projection of infinitely divisible ones. Indeed, in finite time, the distributions of the $U^{CT}_{(\alpha,\beta)}(t)$ are infinitely divisible. This has been pointed out by Janson ([9], proof of Theorem 3.9). With our notations, it relies on the fact that, for any positive integer n, $U^{CT}_{(\alpha,\beta)}(t)$ has the same distribution as the sum of n independent copies of $U^{CT}_{(\alpha/n,\,\beta/n)}(t)$, where a continuous time branching process (with the same branching dynamics as before), starting from real (not necessarily integer) conditions, is suitably defined.



3.3 DT and CT connections 



Apply the first projection to the embedding principle (12):

$$u_1\left(U^{CT}(\tau_n)\right) = u_1\left(U^{DT}(n)\right) \quad \text{a.s.}$$

By definition (8) of $u_1$, this number is $\frac{1}{S}$ times the number of balls in the urn at time n, which equals $\frac{1}{S}(u + Sn) = n(1 + o(1))$. Since the stopping times $\tau_n$ tend to +∞, renormalizing by $e^{-S\tau_n}$ and applying the convergence result of Section 3.2 leads to

$$\xi = \lim_{n \to +\infty} n\, e^{-S\tau_n}. \qquad (17)$$

Apply now the second projection to the embedding principle (12):

$$u_2\left(U^{CT}(\tau_n)\right) = u_2\left(U^{DT}(n)\right) \quad \text{a.s.}$$

Renormalizing by $e^{-m\tau_n}$ implies that

$$e^{-m\tau_n}\, u_2\left(U^{CT}(\tau_n)\right) = \left(n\, e^{-S\tau_n}\right)^{\sigma}\, \frac{1}{n^{\sigma}}\, u_2\left(U^{DT}(n)\right) \quad \text{a.s.}$$

which is a "martingale connection" in finite time.
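Limit (17) can be sanity-checked in expectation using only Lemma 3.2. Since the gaps $\tau_{k+1} - \tau_k$ are independent and Exp(u + Sk)-distributed, and the Laplace transform of an Exp(λ) variable at s is λ/(λ + s), the product telescopes to $E\,e^{-S\tau_n} = u/(u + nS)$, so that $E(n\,e^{-S\tau_n}) \to u/S$, the mean of a Gamma(u/S) variable. The sketch below illustrates this; the function names are assumptions made for illustration.

```python
import random
from fractions import Fraction

def expected_exp_minus_S_tau(u, S, n):
    """Exact E[exp(-S * tau_n)], where tau_n is a sum of independent
    Exp(u + S*k) gaps, k = 0..n-1 (Lemma 3.2, 1))."""
    prod = Fraction(1)
    for k in range(n):
        rate = u + S * k
        prod *= Fraction(rate, rate + S)  # Laplace transform of Exp(rate) at S
    return prod

def sample_jump_times(u, S, n, rng):
    """Draw one realisation of (tau_1, ..., tau_n)."""
    t, times = 0.0, []
    for k in range(n):
        t += rng.expovariate(u + S * k)  # expovariate takes the rate parameter
        times.append(t)
    return times

# Example with u = 2 initial balls and balance S = 7.
exact = expected_exp_minus_S_tau(2, 7, 50)
times = sample_jump_times(2, 7, 50, random.Random(1))
```

Only the expectation is checked exactly here; the almost sure convergence in (17) is a statement about each trajectory and is not captured by this computation.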



Thanks to (17) and Theorem 3.3, passing to the limit n → ∞ leads to the following theorem, already mentioned in Janson [9] in a more general frame. Note that the independence between ξ and $W^{DT}$ comes from Lemma 3.2, 2).

Theorem 3.10 (Martingale connection)

$$W^{CT} = \xi^{\sigma}\, W^{DT} \quad \text{a.s.}, \qquad (18)$$

ξ and $W^{DT}$ being independent.

The distribution of $\xi^{\sigma}$ is invertible^6 so that any information on $W^{CT}$ can be pulled back to $W^{DT}$ thanks to connection (18).



4 Dislocation equations for continuous urns 

4.1 Vectorial finite time dislocation equations 

By embedding in continuous time, the previous section provided a branching process $(U^{CT}_{(\alpha,\beta)}(t), t \geq 0)$. The independence properties of this process imply that it is equal to the sum of α copies of $U^{CT}_{(1,0)}(t)$ (the process starting from one red ball) and β copies of $U^{CT}_{(0,1)}(t)$ (the process starting from one black ball). We are led to study these two $\mathbb{R}^2$-valued processes.

Let us now apply the strong Markov branching property to these processes: let us denote by τ the first splitting time for any of these processes (they have the same Exp(1) distribution). We get the following vectorial finite time dislocation equations:

$$\forall t > \tau, \quad \begin{cases} U^{CT}_{(1,0)}(t) \overset{\mathcal{L}}{=} [a+1]\,U^{CT}_{(1,0)}(t - \tau) + [b]\,U^{CT}_{(0,1)}(t - \tau) \\ U^{CT}_{(0,1)}(t) \overset{\mathcal{L}}{=} [c]\,U^{CT}_{(1,0)}(t - \tau) + [d+1]\,U^{CT}_{(0,1)}(t - \tau), \end{cases} \qquad (19)$$

where the notation [n]X + [m]Y denotes the sum of n copies of the random variable X and m copies of the random variable Y (n and m are nonnegative integers).

6 A probability distribution A is called invertible when, for any probability distributions A and B, the equation AX = B admits a unique solution X independent of A, see for instance Chaumont and Yor [4]. The invertibility of any power of a Gamma distribution can be shown by elementary considerations on Fourier transforms.
Remark 4.11 The above equations could be written with a.s. equalities. Taking a probability space of trees is more comfortable. The price to pay is just to write the different processes for each subtree with different indexes and to distinguish the two splitting times for the two starting situations.



4.2 Limit dislocation equations 

Remember that $(e^{-mt}u_2(U^{CT}_{(1,0)}(t)))_t$ and $(e^{-mt}u_2(U^{CT}_{(0,1)}(t)))_t$ are martingales whose expectations are $u_2(U^{CT}_{(1,0)}(0)) = b/S$ and $u_2(U^{CT}_{(0,1)}(0)) = -c/S$ respectively. They converge in $L^p$ for every p ≥ 1. We are interested in the probability distributions of

$$X := \lim_{t \to +\infty} e^{-mt}\, u_2\left(U^{CT}_{(1,0)}(t)\right) \quad \text{and} \quad Y := \lim_{t \to +\infty} e^{-mt}\, u_2\left(U^{CT}_{(0,1)}(t)\right). \qquad (20)$$

Projecting along the second eigenline, scaling and passing to the limit in System (19) lead straightforwardly to the following proposition.

Proposition 4.12 The limit random variables X and Y are solutions of the following (scalar) limit dislocation equations:

$$\begin{cases} X \overset{\mathcal{L}}{=} e^{-m\tau}\left([a+1]X + [b]Y\right) \\ Y \overset{\mathcal{L}}{=} e^{-m\tau}\left([c]X + [d+1]Y\right) \end{cases} \qquad (21)$$

with

$$E(X) = \frac{b}{S}, \qquad E(Y) = -\frac{c}{S}, \qquad (22)$$

where τ is the Exp(1)-distributed first splitting time and all the mentioned variables are independent.
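As a consistency check (a worked computation added here, not part of the original argument), one can verify that the means (22) are compatible with the dislocation equations (21). It only uses the independence stated in Proposition 4.12, the fact that τ is Exp(1)-distributed, and the relations m = a − c = d − b.

```latex
\begin{align*}
E\left(e^{-m\tau}\right) &= \int_0^{+\infty} e^{-mt}\, e^{-t}\, dt = \frac{1}{m+1}, \\
E\left(e^{-m\tau}\right)\bigl((a+1)E(X) + b\,E(Y)\bigr)
  &= \frac{1}{m+1}\cdot\frac{(a+1)b - bc}{S}
   = \frac{b\,(a - c + 1)}{(m+1)S} = \frac{b}{S} = E(X), \\
E\left(e^{-m\tau}\right)\bigl(c\,E(X) + (d+1)E(Y)\bigr)
  &= \frac{1}{m+1}\cdot\frac{cb - (d+1)c}{S}
   = -\frac{c\,(d - b + 1)}{(m+1)S} = -\frac{c}{S} = E(Y).
\end{align*}
```

Taking expectations in (21) therefore gives back exactly the values of (22), with no further constraint on a, b, c, d beyond the balance condition.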



Remark 4.13 Janson ([9]) in his Theorem 3.9 gets the same limit dislocation equations. He obtains the uniqueness of the solution in $L^2$ by a fixed point method. Hereunder in Section 6.3, calculating explicitly the solution of the fixed point system (21) together with conditions (22), we give in passing another proof of the uniqueness in $L^2$.
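5 Characteristic functions: fundamental differential system

Let $\mathcal{F}$ and $\mathcal{G}$ be respectively X's and Y's characteristic functions:

$$\forall x \in \mathbb{R}, \quad \mathcal{F}(x) := E\left(e^{ixX}\right) = \int_{-\infty}^{+\infty} e^{ixt}\, d\mu_X(t)$$

with a similar formula for $\mathcal{G}$. Since X and Y admit moments of all orders, $\mathcal{F}$ and $\mathcal{G}$ are infinitely differentiable on $\mathbb{R}$.

Proposition 5.14 The characteristic functions $\mathcal{F}$ and $\mathcal{G}$ are solutions of the differential system

$$\begin{cases} \mathcal{F}(x) + m x \mathcal{F}'(x) = \mathcal{F}(x)^{a+1}\,\mathcal{G}(x)^{b} \\ \mathcal{G}(x) + m x \mathcal{G}'(x) = \mathcal{F}(x)^{c}\,\mathcal{G}(x)^{d+1} \end{cases} \qquad (23)$$

and satisfy the boundary conditions at the origin

$$\begin{cases} \mathcal{F}(x) = 1 + i\,\frac{b}{S}\,x + O(x^2) \\ \mathcal{G}(x) = 1 - i\,\frac{c}{S}\,x + O(x^2). \end{cases} \qquad (24)$$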
Proof. Conditioning by τ, whose distribution is exponential with mean 1, the first dislocation equation (21) implies successively that, for any $x \in \mathbb{R}$,

$$\mathcal{F}(x) = E\Bigl(E\bigl(\exp\bigl(ix\,e^{-m\tau}([a+1]X + [b]Y)\bigr) \,\big|\, \tau\bigr)\Bigr) = \int_0^{+\infty} \mathcal{F}^{a+1}\left(xe^{-mt}\right)\mathcal{G}^{b}\left(xe^{-mt}\right) e^{-t}\, dt.$$

After a change of variable under the integral, this functional equation can be written

$$\forall x \neq 0, \quad \mathcal{F}(x) = \frac{x}{m\,|x|^{1 + \frac{1}{m}}} \int_0^x \mathcal{F}^{a+1}(t)\,\mathcal{G}^{b}(t)\, \frac{dt}{|t|^{1 - \frac{1}{m}}}.$$

Differentiation of this equality and of the similar one obtained from the second dislocation equation in (21) leads to the result. The boundary conditions come elementarily from the computation of X's and Y's means and from the existence of their second moment (Taylor expansion of $\mathcal{F}$ and $\mathcal{G}$ at 0). ■
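The differentiation step of the proof can be spelled out for x > 0 (the case x < 0 is similar). The following computation is a sketch added for the reader, under the assumption that the integral form of the functional equation, for x > 0, reads $m\,x^{1/m}\mathcal{F}(x) = \int_0^x \mathcal{F}^{a+1}(t)\,\mathcal{G}^{b}(t)\,t^{\frac{1}{m}-1}\,dt$, which is what the substitution $s = xe^{-mt}$ gives.

```latex
\begin{align*}
\frac{d}{dx}\Bigl(m\,x^{\frac{1}{m}}\,\mathcal{F}(x)\Bigr)
  &= x^{\frac{1}{m}-1}\,\mathcal{F}(x) + m\,x^{\frac{1}{m}}\,\mathcal{F}'(x), \\
\frac{d}{dx}\int_0^x \mathcal{F}^{a+1}(t)\,\mathcal{G}^{b}(t)\,t^{\frac{1}{m}-1}\,dt
  &= \mathcal{F}^{a+1}(x)\,\mathcal{G}^{b}(x)\,x^{\frac{1}{m}-1}.
\end{align*}
```

Equating both derivatives and dividing by $x^{\frac{1}{m}-1}$ yields $\mathcal{F}(x) + m x \mathcal{F}'(x) = \mathcal{F}(x)^{a+1}\mathcal{G}(x)^{b}$, which is the first equation of the system.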



Remark 5.15 The differential system (23) is singular at 0, so that the uniqueness of its solution that satisfies the boundary condition (24) is not an elementary consequence of general theorems for ordinary differential equations.
6 Resolution of the fundamental differential system

6.1 Change of functions: heuristics

Formally, without carefully checking which m-th roots should be considered, if the variables $x \in \mathbb{R}$ and $w \in \mathbb{C}$ are related by $x^S w^m = 1$, the change of functions

$$\begin{cases} f(w) = x^{\frac{1}{m}}\,\mathcal{F}(x) \\ g(w) = x^{\frac{1}{m}}\,\mathcal{G}(x) \end{cases}$$

reduces the problem (23) and (24) to the regular differential system

$$\begin{cases} f' = f^{a+1} g^{b} \\ g' = f^{c} g^{d+1} \end{cases} \qquad (25)$$

with boundary conditions at infinity

$$\begin{cases} f(w) = w^{-\frac{1}{S}} + i\,\frac{b}{S}\, w^{-\frac{1+m}{S}} + O\left(|w|^{-\frac{1+2m}{S}}\right) \\ g(w) = w^{-\frac{1}{S}} - i\,\frac{c}{S}\, w^{-\frac{1+m}{S}} + O\left(|w|^{-\frac{1+2m}{S}}\right). \end{cases} \qquad (26)$$

The basic fact for the resolution of (25) is that it admits $1/g^m - 1/f^m$ as first integral: if K is any complex number such that the constant function $1/g^m - 1/f^m$ equals $1/K^m$, then $g^m$ can be straightforwardly expressed as a function of f and (25) implies that f is a solution of the ordinary differential equation

$$f' = K^{S+1}\, \frac{\left(\frac{f}{K}\right)^{S+1}}{\left(1 + \left(\frac{f}{K}\right)^{m}\right)^{b/m}} \qquad (27)$$

with boundary conditions coming from (26).

This leads to consider a primitive of the function $z \mapsto (1 + z^m)^{b/m} / z^{S+1}$ in the complex field.
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6.2 Abelian integral / and its inverse J 

For all integers m, S and b that satisfy S > 5, S/2 <m<S,l<b< S/2, we 
denote by / = I m s b the function 



I(z) = j (l + u m ) 



du 

v S+l 

,zoo) 



+oc r i - dt 



f S+l' 



where [z, zoo) denotes the ray {tz, t > 1} and where the power 1/m is used 
for the principal determination of the m-th root. The function / is an Abelian 
integral on the curve x m — y m = 1 (which is isomorphic to the famous Fermat 
curve x m + y m = 1 by a straightforward linear change of variables) , defined on 
the open set 

O m = C\ |J M> e^ 1+ ^. 

pe{0,...,m-l} 

Note that the integral is convergent because S — b + 1 > 3. Let S m be the open 
sector of the complex plane defined by 

S m = {zeC\{0}, - - < arg(^) < -}. 

m m 

The open set O m is the union of 5 m 's images under all rotations of angles 2kn /m 
around the origin, k G Z. 

In the following, the notation ( b ^™) denotes the ordinary binomial coefficient, 
generalized for rational (or even complex) values of b/m by Euler's Gamma func- 
tion. As everywhere in the paper, the positive integer a is a = S — b. 

Proposition 6.16 (Properties of $I$)

1- $I$ is holomorphic on $O_m$ and for any $z \in O_m$,

\[
I'(z) = -\frac{(1 + z^m)^{\frac{b}{m}}}{z^{S+1}}. \qquad (28)
\]

2- For any $m$-th root of unity $\omega$ and for any $z \in O_m$,

\[
I(\omega z) = \omega^{-S} I(z). \qquad (29)
\]

3- The function $I$ admits a power series expansion in the neighborhood of infinity
in any connected component of $O_m$. On $\mathcal{S}_m$, this expansion is given by the formula

\[
I(z) = \sum_{n \ge 0} \binom{b/m}{n} \frac{1}{a + mn}\, \frac{1}{z^{a+mn}}
= \frac{1}{a z^a} + \frac{b}{m(a + m)}\, \frac{1}{z^{a+m}} + \dots, \qquad (30)
\]

valid for any $z \in \mathcal{S}_m$, $|z| \ge 1$.

4- The function $I$ admits a Laurent series expansion in the neighborhood of the
origin in any connected component of $O_m$. On $\mathcal{S}_m$, this expansion is given by the
formula

\[
I(z) = \frac{1}{S z^S} + \frac{b}{m(S - m)}\, \frac{1}{z^{S-m}} + C_0 - \sum_{n \ge 2} \binom{b/m}{n} \frac{z^{mn - S}}{mn - S} \qquad (31)
\]

where $C_0$ is the constant

\[
C_0 = \sum_{n \ge 0} \binom{b/m}{n} \left( \frac{1}{mn + a} + \frac{1}{mn - S} \right). \qquad (32)
\]

Formula (31) is valid for any $z \in \mathcal{S}_m$, $|z| \le 1$.

5- $C_0 < 0$.
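As a sanity check of the expansion at infinity, one can compare it with a direct quadrature of the defining integral on the positive real axis. The sketch below is illustrative only: the parameters $(m, S, b) = (3, 5, 1)$, the substitution $u = z/s$ and the use of Simpson's rule are assumptions of this example, not taken from the paper.

```python
# Sketch: compare the series sum_n binom(b/m, n) / ((a + m n) z^(a + m n))
# with a quadrature of I(z) for a real z > 1.
# Hypothetical parameters: m=3, S=5, b=1, hence a = S - b = 4.
m, S, b = 3, 5, 1
a = S - b
beta = b / m

def I_quad(z, n=2000):
    """I(z) for real z > 0. With u = z/s the definition becomes
    I(z) = z^(-S) * int_0^1 (s^m + z^m)^(b/m) * s^(a-1) ds  (composite Simpson)."""
    h = 1.0 / n
    def integrand(s):
        return (s ** m + z ** m) ** beta * s ** (a - 1)
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / z ** S

def I_series(z, terms=200):
    """Partial sum of the expansion at infinity, valid for |z| >= 1."""
    total, binom = 0.0, 1.0
    for n in range(terms):
        total += binom / ((a + m * n) * z ** (a + m * n))
        binom *= (beta - n) / (n + 1)   # binom(beta, n+1) from binom(beta, n)
    return total

z = 1.5
print(I_quad(z), I_series(z))
```

Both numbers should agree to many digits; the leading behaviour is $1/(a z^a)$.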



Proof. 1- and 2- are direct consequences of the definition of $I$. Expansion (30) and
its validity for $z \in \mathcal{S}_m$, $|z| > 1$ come directly from the power series expansion of
$\xi \mapsto (1 + \xi)^{b/m}$ in the definition of $I$. Its validity for $|z| = 1$ is given by the convergence
of the series at such a $z$ and application of Abel's theorem^7, proving 3-. To prove
expansion (31), notice first that $I$ is holomorphic on the simply connected domain
$\mathcal{S}_m$ and that $I'(z)$ tends to 0 as $z$ tends to infinity, so that integration on the ray
$[z, z\infty)$ is equivalent to integration on $[z, 1]$ followed by integration on $[1, +\infty)$.
Thus,

\[
I(z) = I(1) + \int_{[z, 1]} (1 + u^m)^{\frac{b}{m}}\, \frac{du}{u^{S+1}}.
\]

Power series expansion of $u \mapsto (1 + u)^{b/m}$ under this last integral leads then to
(31). The proof of 4- is again made complete by application of Abel's theorem.
Note that, since $S$ is not a multiple of $m$ because of our assumptions on the
parameters, the denominators in Formula (31) do not vanish. Finally, if $\alpha_n$
denotes the general term of the series (32), a straightforward computation shows
that

\[
\alpha_0 + \alpha_1 = \frac{(S - a)(m^2 + aS)(S - a - m)}{a m S (a + m)(S - m)} < 0,
\]

the last inequality coming from $S - a - m < S - S/2 - S/2 = 0$ and from
the other hypotheses on the parameters. Furthermore, $\alpha_{2n} + \alpha_{2n+1} < 0$ for any
$n \ge 1$, which concludes the proof. [Hint: compute $\alpha_{2n} + \alpha_{2n+1}$, factor out $\binom{b/m}{2n+1}$,
use the fact $(2n + 1)/(2n - b/m) > 1$, notice that $\binom{b/m}{2n+1} > 0$ because
$0 < b/m = (S - a)/m < S/2m < 1$.] ■

7 We refer to the following theorem of Abel: if a series $\sum_n a_n$ is convergent, then the power
series $\sum_n a_n z^n$ converges uniformly on the segment $[0, 1]$.
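The sign argument can be replayed on one example. The sketch below evaluates the general term $\binom{b/m}{n}\big(\frac{1}{mn+a} + \frac{1}{mn-S}\big)$ of the series defining the constant in exact rational arithmetic, compares the sum of the first two terms with the closed form, and checks that consecutive pairs are negative. The parameter values $(m, S, b) = (3, 5, 1)$ and the helper names are assumptions of this illustration.

```python
from fractions import Fraction

# Hypothetical parameters: m=3, S=5, b=1, hence a = S - b = 4.
m, S, b = 3, 5, 1
a = S - b

def alpha(n):
    """General term of the series, exactly: binom(b/m, n) * (1/(mn+a) + 1/(mn-S))."""
    binom = Fraction(1)
    for k in range(n):
        binom *= (Fraction(b, m) - k) / (k + 1)
    return binom * (Fraction(1, m * n + a) + Fraction(1, m * n - S))

# Closed form of alpha_0 + alpha_1 stated in the proof.
closed_form = Fraction((S - a) * (m * m + a * S) * (S - a - m),
                       a * m * S * (a + m) * (S - m))

def C0_partial(pairs=2000):
    """Sum of the first 2*pairs terms of the series, in floating point."""
    total, binom = 0.0, 1.0
    for n in range(2 * pairs):
        total += binom * (1.0 / (m * n + a) + 1.0 / (m * n - S))
        binom *= (b / m - n) / (n + 1)
    return total

print(alpha(0) + alpha(1), closed_form, C0_partial())
```

Since every pair of consecutive terms is negative, each even-length partial sum is negative as well, which is the mechanism behind the fifth item of the proposition.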

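Let $\mathbb{H}$ denote the Poincaré half-plane:

\[
\mathbb{H} = \{ z \in \mathbb{C},\ \Im(z) > 0 \} \quad \text{and} \quad \overline{\mathbb{H}} = \{ \bar{z},\ z \in \mathbb{H} \}.
\]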

Proposition 6.17 The analytic function $I : \mathcal{S}_m \cap \mathbb{H} \to \mathbb{C}$ is a conformal mapping
onto the open subset

\[
\mathcal{U} = \Big\{ z,\ -\frac{a\pi}{m} < \arg(z) < 0 \Big\} \cup \Big( I_1 + \Big\{ z,\ -\frac{S\pi}{m} < \arg(z) < -\frac{a\pi}{m} \Big\} \Big)
\]

(see Figure 2), where

\[
I_1 := \frac{1}{m}\, B\Big( \frac{a}{m}, \frac{m + b}{m} \Big)\, e^{-\frac{i a \pi}{m}} \qquad (33)
\]

and where $B$ denotes Euler's Beta function $B(x, y) = \Gamma(x)\Gamma(y)/\Gamma(x + y)$.

Figure 2: Domain $\mathcal{S}_m \cap \mathbb{H}$ and its image by $I$.



Proof. Let's denote $\zeta_m = \exp(i\pi/m)$. We show hereunder that the restriction
of $I$ to the sector $\mathcal{S}_m \cap \mathrm{Cl}(\mathbb{H})$ (where $\mathrm{Cl}(\mathbb{H})$ denotes the topological closure of
$\mathbb{H}$) admits a continuous continuation to the ray $\{t\zeta_m,\ t > 0\}$ and that this
continuation maps homeomorphically the boundary of the sector $\mathcal{S}_m \cap \mathbb{H}$ onto the
boundary of $\mathcal{U}$. The result is then a consequence of elementary geometrical conformal
theory (see for example Saks and Zygmund |2*2]). 
Let $h \in \mathbb{H}$, $r > 0$, $t \ge 1$ and $z = r(1 - h)\zeta_m$. When $h$ tends to 0, then
$1 + (tz)^m = 1 - r^m t^m + m r^m t^m h + O(h^2)$, so that the value of the principal
determination of the $m$-th root of $1 + (tz)^m$ according to the sign of $1 - (rt)^m$ leads to the
respective limits in terms of incomplete Beta functions:

\[
\text{if } r \ge 1, \text{ then} \quad \lim_{z \to r\zeta_m,\ z \in \mathcal{S}_m} I(z) = \frac{e^{-\frac{i\pi a}{m}}}{m} \int_0^{1/r^m} (1 - u)^{\frac{b}{m}}\, u^{\frac{a}{m} - 1}\, du; \qquad (34)
\]

\[
\text{if } r \le 1, \text{ then} \quad \lim_{z \to r\zeta_m,\ z \in \mathcal{S}_m} I(z) = I_1 + \frac{e^{-\frac{i\pi S}{m}}}{m} \int_1^{1/r^m} (u - 1)^{\frac{b}{m}}\, u^{\frac{a}{m} - 1}\, du. \qquad (35)
\]

The complex number $I_1$ is simply

\[
I_1 = \lim_{z \to \zeta_m,\ z \in \mathcal{S}_m} I(z).
\]

Formula (33) is a consequence of the integral representation of Euler's Beta function
$B(\alpha, \beta) = \int_0^1 (1 - u)^{\alpha - 1} u^{\beta - 1}\, du$. The monotonicity of the real integrals (34) and (35)
with respect to $r$ shows that the continuous continuation of $I$ defined by these
formulae maps decreasingly the ray $]0, +\infty[$ onto itself and respectively the ray
$]0, \zeta_m]$ onto the ray $\{ I_1 + t\zeta_m^{-S},\ t \ge 0 \}$ and the ray $[\zeta_m, \zeta_m\infty)$ onto the segment $]0, I_1]$. ■

Remark 6.18 By computation in the realm of hypergeometric functions, one
shows that the numbers $C_0$ defined by (32) and $I_1$ defined by (35) are related by

\[
C_0 = \frac{\sin \pi(1 + b/m)}{\sin \pi(1 + S/m)}\, |I_1|
= \frac{1}{m}\, \frac{\sin \pi(1 + b/m)}{\sin \pi(1 + S/m)}\, B\Big( \frac{S - b}{m}, \frac{m + b}{m} \Big).
\]
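This relation lends itself to a quick numerical confirmation on one example. The sketch below sums the series $\sum_n \binom{b/m}{n}\big(\frac{1}{mn+a} + \frac{1}{mn-S}\big)$ and compares the result with $\frac{1}{m}\frac{\sin\pi(1+b/m)}{\sin\pi(1+S/m)}B\big(\frac{S-b}{m},\frac{m+b}{m}\big)$, evaluated with `math.gamma`. The parameters $(m, S, b) = (3, 5, 1)$ and the truncation order are assumptions of this illustration.

```python
import math

# Hypothetical parameters: m=3, S=5, b=1, hence a = S - b = 4.
m, S, b = 3, 5, 1
a = S - b

def C0_series(terms=200000):
    """Truncated series for the constant C_0 (the tail decays like a power of n)."""
    total, binom = 0.0, 1.0
    for n in range(terms):
        total += binom * (1.0 / (m * n + a) + 1.0 / (m * n - S))
        binom *= (b / m - n) / (n + 1)
    return total

def beta_fn(x, y):
    """Euler's Beta function B(x, y) = Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def C0_closed():
    """Closed form in terms of sines and a Beta function."""
    ratio = math.sin(math.pi * (1 + b / m)) / math.sin(math.pi * (1 + S / m))
    return ratio * beta_fn((S - b) / m, (m + b) / m) / m

print(C0_series(), C0_closed())
```

For this parameter choice the ratio of sines equals $-1$, so the constant is simply the opposite of the modulus of $I_1$.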

Definition 6.19 Let $J = J_{m,S,b} : \mathbb{C} \setminus ]-\infty, 0] \to \mathcal{S}_m$ be the only continuous function
defined by:

• $\forall z \in \overline{\mathbb{H}}$, $J(z) = I^{-1}(z)$ in the sense of Proposition 6.17 ($\overline{\mathbb{H}}$ is an open subset
of $\mathcal{U}$ so that this functional inverse exists);

• $\forall z \in \mathbb{H}$, $J(z) = \overline{J(\bar{z})}$ (complex conjugacy).

The properties of $I$ shown in Propositions 6.16 and 6.17 imply that $J$ is a
conformal mapping between $\mathbb{C} \setminus ]-\infty, 0]$ and an open subset of $\mathcal{S}_m$ (use Schwarz
reflection principle), that maps $\overline{\mathbb{H}}$ into $\mathcal{S}_m \cap \mathbb{H}$ and $\mathbb{H}$ into $\mathcal{S}_m \cap \overline{\mathbb{H}}$. If $\mathcal{C}$ denotes the
inverse image of the negative real axis by the restriction of $I$ to $\mathcal{S}_m \cap \mathbb{H}$, then the boundary
of the image of $J$ is $\mathcal{C} \cup \overline{\mathcal{C}} \cup \{0\}$ (see Figure 3). Furthermore, the restriction of $J$ to the
positive real half-line is the inverse of the restriction of $I$, and $J$ is the unique analytic continuation
of $(I|_{]0, +\infty[})^{-1}$ to the slit plane. Naturally, the formula $J(\bar{z}) = \overline{J(z)}$ is valid when
$z$ is any complex number of the slit plane.

Proposition 6.20 For any negative real number $x$, both limits

\[
\lim_{z \to x,\ z \in \mathbb{H}} J(z) \quad \text{and} \quad \lim_{z \to x,\ z \in \overline{\mathbb{H}}} J(z)
\]

exist, are nonreal and conjugate (thus different).

Proof. Direct consequence of the preceding properties of $J$ and of Proposition 6.17 (see
Figure 3). ■

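We adopt the following notation:

\[
\forall x < 0, \quad
J(x-) = \lim_{z \to x,\ z \in \overline{\mathbb{H}}} J(z) \in \mathcal{S}_m \cap \mathbb{H}, \qquad
J(x+) = \lim_{z \to x,\ z \in \mathbb{H}} J(z) \in \mathcal{S}_m \cap \overline{\mathbb{H}}. \qquad (36)
\]

Figure 3: Action of $J$ on the slit plane $\mathbb{C} \setminus \mathbb{R}_{\le 0}$.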



Proposition 6.21 The function $J$ admits, as $z$ tends to infinity in the slit plane
$\mathbb{C} \setminus \mathbb{R}_{-}$, an asymptotic Puiseux series expansion at any order in the scale

\[
\Big( \frac{1}{z} \Big)^{\frac{1 + pm + qS}{S}}, \quad (p, q) \in \mathbb{N}^2,
\]

where all fractional powers denote principal determination. The beginning of this
asymptotic expansion is

\[
J(z) = \Big( \frac{1}{Sz} \Big)^{\frac{1}{S}} + \frac{b}{m(S - m)} \Big( \frac{1}{Sz} \Big)^{\frac{m+1}{S}} + O\Big( \Big( \frac{1}{Sz} \Big)^{\frac{S+1}{S}} \Big). \qquad (37)
\]

Proof. Reversion formula $J \circ I = \mathrm{Id}$ from expansion (31).
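On the positive real half-line, $J$ is the inverse of a decreasing function and can therefore be computed by bisection, which gives a cheap way to look at the first two terms of the expansion, $\varepsilon + \frac{b}{m(S-m)}\varepsilon^{m+1}$ with $\varepsilon = (Sz)^{-1/S}$. The sketch below is an illustration under assumptions: the parameters $(m, S, b) = (3, 5, 1)$, the quadrature rule, the bracketing interval and all function names are choices of this example.

```python
# Hypothetical parameters: m=3, S=5, b=1, hence a = S - b = 4.
m, S, b = 3, 5, 1
a = S - b

def I_real(z, n=2000):
    """I(z) for real z > 0: z^(-S) * int_0^1 (s^m + z^m)^(b/m) s^(a-1) ds (Simpson)."""
    h = 1.0 / n
    def integrand(s):
        return (s ** m + z ** m) ** (b / m) * s ** (a - 1)
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / z ** S

def J_real(y, lo=1e-3, hi=1e3, iters=100):
    """Inverse of the decreasing map I on ]0, +oo[, by bisection on a log scale."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if I_real(mid) > y:
            lo = mid      # I(mid) is too large: the solution lies to the right
        else:
            hi = mid
    return (lo * hi) ** 0.5

def J_two_terms(y):
    """First two terms of the Puiseux expansion of J at infinity."""
    eps = (1.0 / (S * y)) ** (1.0 / S)
    return eps + b / (m * (S - m)) * eps ** (m + 1)

print(J_real(1000.0), J_two_terms(1000.0))
```

The remaining gap between the two printed values is of the order of the first neglected term, $\varepsilon^{S+1}$.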



6.3 Computation of characteristic functions 

This section gives an explicit closed form of the characteristic functions $\mathcal{F}$ and $\mathcal{G}$ for
the elementary continuous time urn processes $X$ and $Y$ (defined in (20)) associated
with the replacement matrix $R = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, in terms of the just defined
functions $J$. Remember: the urn is supposed to be large and non triangular so
that $b > 0$ and $c > 0$. Let $\kappa$ be the positive number defined by

\[
\kappa = \Big( \frac{S}{m(S - m)} \Big)^{\frac{1}{m}}. \qquad (38)
\]



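Theorem 6.22 The characteristic functions $\mathcal{F}$ and $\mathcal{G}$ are the unique solutions
of the differential system (23) that satisfy boundary conditions (24). They are
given by the formulae

\[
\forall x > 0, \quad
\begin{cases}
\mathcal{F}(x) = \kappa e^{-\frac{i\pi}{2m}}\, x^{-\frac{1}{m}}\, J_{m,S,b}\Big( C_0 + \dfrac{\kappa^S e^{-\frac{i\pi S}{2m}}}{S}\, x^{-\frac{S}{m}} \Big) \\[2ex]
\mathcal{G}(x) = \kappa e^{\frac{i\pi}{2m}}\, x^{-\frac{1}{m}}\, J_{m,S,c}\Big( C_0 + \dfrac{\kappa^S e^{\frac{i\pi S}{2m}}}{S}\, x^{-\frac{S}{m}} \Big)
\end{cases}
\qquad (39)
\]

and

\[
\forall x \in \mathbb{R}, \quad \mathcal{F}(-x) = \overline{\mathcal{F}(x)}, \quad \mathcal{G}(-x) = \overline{\mathcal{G}(x)}. \qquad (40)
\]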



Proof. 1- We first solve (23) on $\mathbb{R}_{>0}$. Let $F$ and $G$ be solutions of (23) that
satisfy (24). Let's do the change of variable $x \in \mathbb{R}_{>0} \mapsto w = x^{-S/m} \in \mathbb{R}_{>0}$ and
the change of functions

\[
f(w) = w^{-1/S} F\big( w^{-m/S} \big) \quad \text{and} \quad g(w) = w^{-1/S} G\big( w^{-m/S} \big)
\]

that is straightforwardly reversed by the formula $F(x) = x^{-1/m} f(x^{-S/m})$ and a similar
one for $G$ and $g$. Then, $f$ and $g$ are solutions of (25) on $\mathbb{R}_{>0}$ and satisfy boundary
conditions (26) at $+\infty$. In particular, since (25) is a nonsingular differential
system, the Cauchy-Lipschitz theorem guarantees that if $(f, g)$ is any solution, then
$f$ (resp. $g$) is identically zero or does not vanish. This implies that $f$ and $g$ do
not vanish on $\mathbb{R}_{>0}$. Because of the balance condition $a + b = c + d$, differentiation of
$1/g^m - 1/f^m$ leads to the fact that this function is constant on $\mathbb{R}_{>0}$ (first integral).
Furthermore, boundary conditions at $+\infty$ (26) imply that this constant value is
$i\frac{m}{S}(b + c)$. If $K$ denotes the complex number

\[
K = \kappa \exp\Big( -\frac{i\pi}{2m} \Big)
\]

($\kappa > 0$ has been defined by Formula (38)), this shows that

\[
\forall w > 0, \quad \frac{1}{g^m(w)} - \frac{1}{f^m(w)} = \frac{1}{K^m}. \qquad (41)
\]



Since $f/g$ is continuous on $\mathbb{R}_{>0}$, does not vanish and tends to 1 at $+\infty$ (26),
relation $(f/g)^m = 1 + (f/K)^m$ implies that, on $\mathbb{R}_{>0}$,

\[
g = \frac{f}{\Big( 1 + \big( \frac{f}{K} \big)^m \Big)^{1/m}} \qquad (42)
\]

(principal determination of the $m$-th root). Reporting in the first equation of (25)
shows that $f$ is necessarily a solution of Equation (27) on $\mathbb{R}_{>0}$. Boundary
conditions (26) imply that, when $w$ tends to $+\infty$, $\frac{1}{K} f(w) \sim \frac{1}{\kappa} e^{\frac{i\pi}{2m}} w^{-1/S} \in \mathcal{S}_m$, so
that Equation (27) can be written

\[
\frac{d}{dw} \Big[ I_{m,S,b}\Big( \frac{f}{K} \Big) \Big] = \frac{K^S}{S}
\]

in a neighborhood of $+\infty$. Integration of this equation shows that

\[
I_{m,S,b}\Big( \frac{f}{K} \Big)(w) = \frac{K^S}{S}\, w + C_1
\]

in a neighborhood of $w = +\infty$, for a suitable complex constant $C_1$. The
determination of $C_1$ is made by means of local expansions: since $f$ tends to 0 at $+\infty$,
using (31) and the previous equality leads to

\[
\frac{K^S}{S}\, w + C_1 = \frac{K^S}{S f^S(w)} + \frac{b}{m(S - m)}\, \frac{K^{S-m}}{f^{S-m}(w)} + C_0 + o(1)
\]

when $w$ tends to $+\infty$, so that boundary conditions (26) lead to the equality
$C_1 = C_0$. Note that this computation makes use of the big-O of (26), of the assumption
$1 - 2m/S < 0$ (large urn) and of the relation $S - m = b + c$. Thus, necessarily

\[
f(w) = K J_{m,S,b}\Big( C_0 + \frac{K^S}{S}\, w \Big) \qquad (43)
\]

for any $w$ in a neighborhood of $+\infty$. The function $w \mapsto K J_{m,S,b}(C_0 + K^S w / S)$ is
well defined on $\mathbb{R}_{>0}$ because $C_0 < 0$ (Proposition 6.16, 5-) and $-\pi < \mathrm{Arg}(K^S) <
-\pi/2$, so that it is the only maximal solution on $\mathbb{R}_{>0}$ of Equation (27) that
satisfies the first equation of (26). This shows finally that

\[
\forall x > 0, \quad F(x) = K x^{-\frac{1}{m}}\, J_{m,S,b}\Big( C_0 + \frac{K^S}{S}\, x^{-\frac{S}{m}} \Big).
\]

Since $-K^m = \overline{K}^m$, the same arguments show that, for any $w > 0$,

\[
g(w) = \overline{K}\, J_{m,S,c}\Big( C_0 + \frac{\overline{K}^S}{S}\, w \Big)
\]

which completes the proof of Formula (39).

2- The resolution on $\mathbb{R}_{<0}$ is made the same way. To this effect, let's do the new
change of variable $x \in \mathbb{R}_{<0} \mapsto w = |x|^{-S/m} e^{i\pi S/m} \in \mathbb{R}_{>0}\, e^{i\pi S/m}$. Let's do as well
the change of functions

\[
f(w) = e^{-i\pi/m} |w|^{-1/S} F\big( -|w|^{-m/S} \big) \quad \text{and} \quad g(w) = e^{-i\pi/m} |w|^{-1/S} G\big( -|w|^{-m/S} \big).
\]

These changes of variable and functions are reversed by the formulae $x = -|w|^{-m/S}$
and $F(x) = e^{i\pi/m} |x|^{-1/m} f(e^{i\pi S/m} |x|^{-S/m})$ with a similar formula for $G$ and $g$.
Functions $f$ and $g$ are still solutions of (25) but boundary conditions become, as
$|w|$ tends to infinity,

\[
f(w) = e^{-\frac{i\pi}{m}} |w|^{-\frac{1}{S}} \Big( 1 - i\frac{b}{S} |w|^{-\frac{m}{S}} + O\big( |w|^{-\frac{2m}{S}} \big) \Big), \qquad
g(w) = e^{-\frac{i\pi}{m}} |w|^{-\frac{1}{S}} \Big( 1 + i\frac{c}{S} |w|^{-\frac{m}{S}} + O\big( |w|^{-\frac{2m}{S}} \big) \Big). \qquad (44)
\]

This implies that First integral (41) is still valid (same $K$) and, since $f$ and $g$ are
still equivalent at infinity, Relation (42) is satisfied. Boundary conditions (44)
imply that, when $|w|$ tends to $+\infty$, $\frac{1}{K} f(w) \sim \frac{1}{\kappa} |w|^{-1/S} e^{-\frac{i\pi}{2m}} \in \mathcal{S}_m$. Consequently,
the same arguments as before show that Formula (43) remains valid (note that
$C_0 + wK^S/S \in \mathbb{H}$ so that this formula is well defined for any $w$). This shows that

\[
\forall x < 0, \quad F(x) = K e^{\frac{i\pi}{m}} |x|^{-\frac{1}{m}}\, J_{m,S,b}\Big( C_0 + \frac{K^S}{S}\, e^{\frac{i\pi S}{m}} |x|^{-\frac{S}{m}} \Big).
\]

Since $K e^{i\pi/m} = \overline{K}$, one gets finally $F(-x) = \overline{F(x)}$ for any real number $x$. The
proof of the whole theorem is made complete by the same arguments for $G$. ■



Remark 6.23 Formula (40) on characteristic functions comes directly from the
fact that $X$ and $Y$ are real-valued random variables.

We want to know more about the analyticity properties of $\mathcal{F}$ and $\mathcal{G}$ around
0. Let $\varphi = \varphi_{m,S,b}$ be the function defined by the formula

\[
\varphi(z) = \kappa z^{-1/m}\, J_{m,S,b}\Big( C_0 + \frac{\kappa^S}{S} \big( z^{-1/m} \big)^S \Big) \qquad (45)
\]

where the power $1/m$ denotes the principal determination of the $m$-th root. Note
that $\kappa$ and $C_0$, respectively defined by Formulae (38) and (32), are functions of
$m$, $S$ and $b$ too. If $\rho$ denotes the positive number

\[
\rho = \Big( \frac{S|C_0|}{\kappa^S} \Big)^{-m/S} = \frac{S^{1 - \frac{m}{S}}\, |C_0|^{-\frac{m}{S}}}{m(S - m)},
\]

it follows from the properties of $J$ that $\varphi$ is defined and holomorphic on the open
set

\[
\mathcal{D} = \mathbb{C} \setminus \{ (-\infty, 0] \cup [\rho, +\infty) \}.
\]

Furthermore, the characteristic functions $\mathcal{F}$ and $\mathcal{G}$ are restrictions of $\varphi$ functions
on the imaginary axis: for any nonzero real number $x$,

\[
\mathcal{F}(x) = \varphi_{m,S,b}(ix) \quad \text{and} \quad \mathcal{G}(x) = \varphi_{m,S,c}(-ix).
\]

Note that $\kappa$ is a function of $(m, S)$ so that the same $\kappa$ appears in both functions
$\varphi_{m,S,b}$ and $\varphi_{m,S,c}$ (the $C_0$'s and the $\rho$'s are however different).

Proposition 6.24 The function $\varphi$, holomorphic on $\mathcal{D}$, cannot be analytically
extended on a larger subset of $\mathbb{C}$. However, setting $\varphi(0) = 1$ defines a continuously
differentiable extension of $\varphi$ on $\mathcal{D} \cup \{0\}$.

Proof. The half-line $[\rho, +\infty)$ is the locus of complex $z$ such that $C_0 + \frac{\kappa^S}{S} (z^{-1/m})^S$
is a real nonpositive number (remember that $m < S < 2m$). Since the principal
determination of the $m$-th root is well defined and nonzero in a neighbourhood
of this half-line, Proposition 6.20 implies that $\varphi$ cannot be continuously extended
at any point of $[\rho, +\infty)$.

If $x$ is a negative number, the definition of the principal determination of the $m$-th
root leads to the existence of both limits

\[
\varphi(x+) := \lim_{z \to x,\ z \in \mathbb{H}} \varphi(z) = \kappa e^{-\frac{i\pi}{m}} |x|^{-\frac{1}{m}}\, J\Big( C_0 + \frac{\kappa^S}{S}\, e^{-\frac{i\pi S}{m}} |x|^{-\frac{S}{m}} \Big)
\]

and

\[
\lim_{z \to x,\ z \in \overline{\mathbb{H}}} \varphi(z) := \varphi(x-) = \overline{\varphi(x+)}.
\]

Since the image of $J$ is included in $\mathcal{S}_m$, the limit $\varphi(x+)$ belongs to the open sector
$e^{-\frac{i\pi}{m}} \mathcal{S}_m$ which contains no real number, so that $\varphi(x+) \neq \varphi(x-)$. This shows that
$\varphi$ cannot be continuously extended at any point of $\mathbb{R}_{<0}$.

When $z$ tends to 0 in the slit plane $\mathbb{C} \setminus \mathbb{R}_{\le 0}$, Proposition 6.21 shows that $\varphi(z)$
tends to 1. One step more, computing the derivative of $\varphi$ in terms of $J$ using the
algebraic expression of $I'$ (28) implies, with expansion (37), that

\[
\lim_{z \to 0,\ z \in \mathbb{C} \setminus \mathbb{R}_{\le 0}} \varphi'(z) = \frac{b}{S}.
\]
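On the real interval between 0 and the positive threshold where the argument of $J$ vanishes, that argument is a positive real number, so the function $\kappa z^{-1/m} J\big(C_0 + \frac{\kappa^S}{S} z^{-S/m}\big)$ can be evaluated with a real inverse of $I$ only. The sketch below does this and looks at the slope near 0, which the statement above predicts to be close to $b/S$. Everything specific here is an assumption of the illustration: the parameters $(m, S, b) = (3, 5, 1)$, the quadrature, the bisection bracket and the helper names.

```python
# Hypothetical parameters: m=3, S=5, b=1, hence a = S - b = 4.
m, S, b = 3, 5, 1
a = S - b
kappa = (S / (m * (S - m))) ** (1.0 / m)   # kappa^m = S / (m (S - m))

def C0(terms=200000):
    """Truncated series sum_n binom(b/m, n) (1/(mn+a) + 1/(mn-S))."""
    total, binom = 0.0, 1.0
    for n in range(terms):
        total += binom * (1.0 / (m * n + a) + 1.0 / (m * n - S))
        binom *= (b / m - n) / (n + 1)
    return total

def I_real(z, n=2000):
    """I(z) for real z > 0 by composite Simpson after the substitution u = z/s."""
    h = 1.0 / n
    def integrand(s):
        return (s ** m + z ** m) ** (b / m) * s ** (a - 1)
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / z ** S

def J_real(y, lo=1e-3, hi=1e3, iters=100):
    """Inverse of the decreasing map I on ]0, +oo[ (log-scale bisection)."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if I_real(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

C0_VALUE = C0()

def phi(z):
    """The function phi at a small positive real z (argument of J positive)."""
    arg = C0_VALUE + kappa ** S / S * z ** (-S / m)
    return kappa * z ** (-1.0 / m) * J_real(arg)

print((phi(0.01) - 1.0) / 0.01, b / S)
```

The first printed number is a finite-difference slope at $z = 0.01$, so it is expected to match $b/S$ only up to a correction of order $z$.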



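Corollary 6.25 The exponential moment generating series

\[
\sum_{p \ge 0} \frac{\mathbb{E}(X^p)}{p!}\, T^p \quad \text{and} \quad \sum_{p \ge 0} \frac{\mathbb{E}(Y^p)}{p!}\, T^p
\]

have a radius of convergence equal to 0.

Proof. These series are the Taylor series of $\varphi_{m,S,b}$ and $\varphi_{m,S,c}$ at 0. If these radii
were positive, these functions could be analytically extended to a neighborhood
of the origin, which Proposition 6.24 excludes. ■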



Remark 6.26 The singularity of $\varphi$ at the origin is thus not due to ramification
but to a divergent Taylor series phenomenon. Indeed, the apparent ramification
coming from the $m$-th root at the origin in Formula (45) is compensated by both
Puiseux expansion (37) and the $S$-th power of the $m$-th root in the argument of $J$ in the
definition of $\varphi$.

7 Density of $W^{CT}$

Notice, with the notation (36), that

\[
\mathcal{F}(x) \underset{x \to +\infty}{\sim} \kappa e^{-\frac{i\pi}{2m}}\, J(C_0-)\, x^{-\frac{1}{m}}, \qquad (46)
\]

where the non-real complex number $J(C_0-)$ is different from 0 (see Figure 3).

A first consequence is that $\mathcal{F}(x)$ tends to 0 when $x$ goes off to $+\infty$. Hence the
probability distribution function of $W^{CT}$ is continuous so that the law of $W^{CT}$ has no
point mass.

A second consequence is that $\mathcal{F}$ is not in $L^1$ so that the distribution of $W^{CT}$ cannot
be obtained by classical Fourier inversion. Nevertheless, we obtain in Section 7.3
an expression of this density using the derivative of the characteristic function
$\mathcal{F}$. Before that, we need firstly to ensure that the support of $W^{CT}$ is the whole real
line $\mathbb{R}$, which is proven in Section 7.1, and secondly to ensure that $W^{CT}$ admits
a density, which is proven in Section 7.2 using the martingale connection (18).
As usual, this kind of connection induces a smoothing phenomenon between
$W^{DT}$ and $W^{CT}$, allowing us to prove that $W^{CT}$ has a density, whatever the
distribution of $W^{DT}$ is.



7.1 Support of $W^{CT}$

Proposition 7.27 The support of $W^{CT}$ is $\mathbb{R}$.

Proof. As in (20), let $X$ denote the random variable $W^{CT}$ starting from one red
ball. Because of the branching property (see beginning of Section 4.1), it suffices
to prove that the support of $X$ is the whole real line $\mathbb{R}$. General results on infinite
divisibility (see for instance Steutel [21] p. 186) ensure that the support of an
infinitely divisible random variable having a continuous probability distribution
function is either a half-line or $\mathbb{R}$. Suppose that the support of $X$ is $[\alpha, +\infty[$ for
a given real number $\alpha$. Then, denoting the distribution of $X$ by $\mu_X$,

\[
\mathbb{E}(e^{-sX}) = \int_\alpha^{+\infty} e^{-st}\, d\mu_X(t) = e^{-s\alpha} \int_\alpha^{+\infty} e^{-s(t - \alpha)}\, d\mu_X(t)
\]

exists for every real number $s > 0$. Hence the function $L : s \mapsto \mathbb{E}(e^{-sX})$ is
analytic on the half-plane $\{\Re z > 0\}$, continuous on the boundary of this half-plane
and $\lim_{t \to \pm\infty} \mathbb{E}(e^{itX}) = 0$. By uniqueness of the analytic continuation, necessarily:

\[
L(s) = \varphi(-s), \quad \forall s,\ \Re(s) > 0,
\]

where $\varphi$ has been introduced in (45). But it has been proven in Proposition 6.24
that $\varphi$ cannot be analytically extended on the half-plane $\{\Re z < 0\}$. There is a
contradiction: the support of $X$ cannot be a half-line $[\alpha, +\infty[$.

In the same way, if we suppose that the support of $W^{CT}$ is $]-\infty, \beta]$ for a given
real number $\beta$, we are led to a contradiction, because $\varphi$ cannot be analytically
extended on the whole half-plane $\{\Re z > 0\}$ (Proposition 6.24).



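7.2 Connection between the distribution of $W^{DT}$ and the
density of $W^{CT}$

Proposition 7.28 Let $\mu$ be the distribution of $W^{DT}$ (it is a probability measure
on $\mathbb{R}$).

1- $W^{CT}$ admits a density $p$ on $\mathbb{R}$ given by

\[
\forall w > 0, \quad p(w) = \frac{S}{m\, \Gamma\big( \frac{u}{S} \big)}\, w^{\frac{u}{m} - 1} \int_{]0, +\infty[} v^{-\frac{u}{m}}\, e^{-\left( \frac{w}{v} \right)^{S/m}}\, d\mu(v),
\]

\[
\forall w < 0, \quad p(w) = \frac{S}{m\, \Gamma\big( \frac{u}{S} \big)}\, |w|^{\frac{u}{m} - 1} \int_{]-\infty, 0[} |v|^{-\frac{u}{m}}\, e^{-\left( \frac{w}{v} \right)^{S/m}}\, d\mu(v).
\]

2- The density $p$ is infinitely differentiable and increasing on $\mathbb{R}_{<0}$, infinitely
differentiable and decreasing on $\mathbb{R}_{>0}$; it is not continuous at 0: $\lim_{w \to 0,\ w \neq 0} p(w) = +\infty$.
In particular, the distribution is unimodal, the mode is 0.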

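Proof. 1- To exhibit a density, let's take any real-valued bounded continuous
function $h$ defined on $\mathbb{R}$ and, thanks to the martingale connection (18), compute

\[
\mathbb{E}\big( h(W^{CT}) \big) = \int_{\mathbb{R}} \int_0^{+\infty} h(uv)\, g(u)\, du\, d\mu(v)
\]

where $g$ is the density of the positive random variable that multiplies $W^{DT}$ in (18).
After the change of variable $w = uv$, we get

\[
\mathbb{E}\big( h(W^{CT}) \big) = \int_{]-\infty, 0[} \frac{d\mu(v)}{|v|} \int_{-\infty}^0 h(w)\, g\Big( \frac{w}{v} \Big)\, dw
+ \int_{]0, +\infty[} \frac{d\mu(v)}{v} \int_0^{+\infty} h(w)\, g\Big( \frac{w}{v} \Big)\, dw.
\]

Remember that $W^{CT}$ has no point mass (see the introductory paragraph of Section 7),
so we get that $W^{CT}$ admits a density given by

\[
p(w) = \mathbf{1}_{\{w < 0\}} \int_{]-\infty, 0[} g\Big( \frac{w}{v} \Big) \frac{d\mu(v)}{|v|} + \mathbf{1}_{\{w > 0\}} \int_{]0, +\infty[} g\Big( \frac{w}{v} \Big) \frac{d\mu(v)}{|v|}. \qquad (47)
\]

The only point to verify is that the integrals in Formula (47) are well defined.
The density $g$ is explicit. To lighten the notations, we consider the case when we
start from one ball ($u = 1$). In this case,

\[
g(x) = \frac{S}{m\, \Gamma\big( \frac{1}{S} \big)}\, x^{\frac{1}{m} - 1}\, e^{-x^{S/m}}\, \mathbf{1}_{\{x > 0\}}, \qquad (48)
\]

so that, for any nonzero $w$,

\[
v \mapsto \frac{1}{|v|}\, g\Big( \frac{w}{v} \Big) = C |w|^{\frac{1}{m} - 1} |v|^{-\frac{1}{m}}\, e^{-\left( \frac{w}{v} \right)^{S/m}}\, \mathbf{1}_{\{w/v > 0\}}
\]

is a bounded function of $v$.

2- Let's prove that $\lim_{w \to 0^+} p(w) = +\infty$, looking at

\[
\lim_{w \to 0^+} |w|^{\frac{1}{m} - 1} \int_{]0, +\infty[} |v|^{-\frac{1}{m}}\, e^{-\left( \frac{w}{v} \right)^{S/m}}\, d\mu(v).
\]

The last integral, for any $w < 1$, is greater than

\[
\int_{]0, +\infty[} v^{-\frac{1}{m}}\, e^{-v^{-S/m}}\, d\mu(v)
\]

so that it is sufficient to prove that this integral is a positive constant. If not,
this integral would be equal to zero, and this happens only in the case when $\mu$
does not charge any point in $]0, +\infty[$. By the martingale connection (18), this
would imply that the support of $W^{CT}$ is included in $]-\infty, 0]$, which is not the
case because of Proposition 7.27.

The result on the limit of $p$ at $0^-$ is proved the same way. Differentiability is
immediate by dominated convergence and monotonicity comes from differentiation of
Formula (47). ■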



Remark 7.29 The distribution of $W^{CT}$ is not symmetric around 0 (the expectation
equals $\frac{b}{S} \neq 0$).



7.3 Fourier inversion

The characteristic function $\mathcal{F}$ is not integrable. Nevertheless, Formulae (23) and
(46) imply straightforwardly that, for any real $x \neq 0$,

\[
\mathcal{F}'(x) = \frac{1}{mx}\, \mathcal{F}(x) \big[ \mathcal{F}^a(x)\, \mathcal{G}^b(x) - 1 \big]
\]

and that $\mathcal{F}'$ is in $L^1$. Theorem 7.30 gives an explicit expression of the density of
$W^{CT}$ by means of the inverse Fourier transform of $\mathcal{F}'$, completing Proposition 7.28.

Theorem 7.30 The density $p$ on $\mathbb{R}$ of the random variable $W^{CT}$ is given, for
any $x \neq 0$, by

\[
p(x) = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, \mathcal{F}'(t)\, dt. \qquad (49)
\]



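Proof. Let $F$ be the probability distribution function of $W^{CT}$. We are going to
show that $\forall x \neq 0$,

\[
\lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, \mathcal{F}'(t)\, dt, \qquad (50)
\]

which is sufficient to prove that $W^{CT}$ admits a continuous density given by (49).
For any $h \neq 0$, let $d_h$ be the function defined on $\mathbb{R} \setminus \{0\}$ by

\[
d_h(t) := \frac{1 - e^{-ith}}{ith}
\]

and extended by continuity at 0. It follows from the general Fourier inversion
theorem (see for instance Lukacs [12] th. 3.2.1. p 38) that $\forall x \in \mathbb{R}$, $\forall h \neq 0$, since
$x$ and $x + h$ are continuity points of $F$ (remember that $F$ is continuous because
its characteristic function tends to 0 at infinity),

\[
\frac{F(x + h) - F(x)}{h} = \lim_{T \to +\infty} I_{T,h}(x)
\]

where

\[
I_{T,h}(x) := \frac{1}{2\pi} \int_{-T}^{T} e^{-itx}\, d_h(t)\, \mathcal{F}(t)\, dt.
\]

Integrating by parts implies that, for any $x \neq 0$,

\[
I_{T,h}(x) = I^{(1)}_{T,h}(x) + I^{(2)}_{T,h}(x) + I^{(3)}_{T,h}(x)
\]

where

\[
I^{(1)}_{T,h}(x) = \frac{1}{2\pi} \Big[ -\frac{e^{-iTx}}{ix}\, d_h(T)\, \mathcal{F}(T) + \frac{e^{iTx}}{ix}\, d_h(-T)\, \mathcal{F}(-T) \Big],
\]

\[
I^{(2)}_{T,h}(x) = \frac{1}{2i\pi x} \int_{-T}^{T} e^{-itx}\, d_h(t)\, \mathcal{F}'(t)\, dt, \qquad
I^{(3)}_{T,h}(x) = \frac{1}{2i\pi x} \int_{-T}^{T} e^{-itx}\, d_h'(t)\, \mathcal{F}(t)\, dt.
\]

It is elementary to see that $d_h(t)$ has the following properties: $\forall h \neq 0$, $\forall t \neq 0$,

\[
|d_h(t)| = \left| \frac{\sin \frac{th}{2}}{\frac{th}{2}} \right| \le \min\Big\{ 1, \frac{2}{|th|} \Big\}, \qquad (51)
\]

\[
|d_h'(t)| \le \min\Big\{ \frac{|h|}{2}, \frac{2}{|t|} \Big\}. \qquad (52)
\]

Since $\mathcal{F}$ is bounded (it is a characteristic function) and since $d_h$ tends to 0 at
infinity,

\[
\lim_{T \to +\infty} I^{(1)}_{T,h}(x) = 0.
\]

Since $\mathcal{F}' \in L^1$, (51) and Lebesgue's dominated convergence theorem lead to

\[
\lim_{T \to +\infty} I^{(2)}_{T,h}(x) = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, d_h(t)\, \mathcal{F}'(t)\, dt.
\]

Finally, (52) implies that $d_h' \mathcal{F} \in L^1$ so that, by dominated convergence,

\[
\lim_{T \to +\infty} I^{(3)}_{T,h}(x) = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, d_h'(t)\, \mathcal{F}(t)\, dt.
\]

So, for any $x \neq 0$ and $h \neq 0$,

\[
\frac{F(x + h) - F(x)}{h} = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, d_h(t)\, \mathcal{F}'(t)\, dt + \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, d_h'(t)\, \mathcal{F}(t)\, dt.
\]

To get (50), it is now sufficient to take the limit when $h \to 0$, using dominated
convergence and (52). ■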

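Remark We have not found the following result in the literature but the arguments
of this proof lead to:

Proposition 7.31 Let $\mathcal{F}$ be the characteristic function of a probability distribution
function $F$. Suppose that $\mathcal{F}$ is differentiable, $\mathcal{F}' \in L^1$ ($\mathcal{F}$ is not necessarily in
$L^1$) and $t \mapsto \mathcal{F}(t)/t$ is integrable at infinity. Then $F$ admits a density $p$ given for all $x \neq 0$ by

\[
p(x) = \frac{1}{2i\pi x} \int_{\mathbb{R}} e^{-itx}\, \mathcal{F}'(t)\, dt.
\]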
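The inversion formula through the derivative of the characteristic function can be tried on a textbook law that shares the relevant features. The sketch below uses the exponential law of parameter 1 as a stand-in (an assumption of this example, unrelated to the urn): its characteristic function $1/(1 - it)$ is not integrable, its derivative $i/(1 - it)^2$ is, and its density is $e^{-x}$ on the positive half-line. The truncation bound and the step of the trapezoidal rule are arbitrary choices.

```python
import cmath
import math

def cf_derivative(t):
    """Derivative of the characteristic function 1/(1 - it) of the Exp(1) law."""
    return 1j / (1 - 1j * t) ** 2

def density_from_derivative(x, T=200.0, h=0.01):
    """Trapezoidal approximation of (1 / (2 i pi x)) * int_{-T}^{T} e^{-itx} F'(t) dt."""
    n = int(round(2 * T / h))
    total = 0j
    for k in range(n + 1):
        t = -T + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-1j * t * x) * cf_derivative(t)
    return (total * h / (2j * math.pi * x)).real

print(density_from_derivative(1.0), math.exp(-1.0))
```

The point of the example is that the factor $1/x$ compensates the multiplication by $x$ hidden in the derivative of the characteristic function, which is why the formula is restricted to nonzero $x$.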



8 Concluding remarks 
8.1 More colours 

The same questions arise naturally for limit laws of large urn processes with
any finite number of colours. Embedding in continuous time, martingale connection,
dislocation equations on elementary limit distributions and differential
system (23) on Fourier transforms or on formal Laplace power series can be generalized.
However, the resolution of (23) relies on the question of its integrability,
even if an explicit closed form of its solutions may not be necessary to derive
properties of the corresponding distributions.

The space requirements of an $m$-ary search tree is a special case of
Pólya-Eggenberger urn process with $m - 1$ colours (see [6] for example). Because of the
negativeness of the diagonal entries $-1, -2, \dots, -(m - 1)$ of its replacement
matrix, the corresponding continuous time Markov process is not a branching
process. However, the discrete time process of the nodes of an $m$-ary search tree can
be embedded into a branching process. When $m \ge 27$, the corresponding limit
laws can be studied with the same method as in the present paper. This is the
subject of a forthcoming companion paper.



8.2 Laplace series 



Remember from Section 4.2 that $X$ (resp. $Y$) is the martingale limit $W^{CT}$ of the
continuous time urn process starting from $(1, 0)$ (resp. from $(0, 1)$). For $n \ge 0$,
let

\[
a_n = \mathbb{E}(X^n) \quad \text{and} \quad b_n = \mathbb{E}(Y^n),
\]

and let $F$ and $G$ be the Laplace series of $X$ and $Y$, i.e. the formal exponential
series of the moments:

\[
F(T) = \sum_{n \ge 0} \frac{a_n}{n!}\, T^n \quad \text{and} \quad G(T) = \sum_{n \ge 0} \frac{b_n}{n!}\, T^n \in \mathbb{R}[[T]].
\]

From equations (21) we write recursion relations between the $a_n$ and the $b_n$.
Thanks to the multinomial formula, they arrange themselves into the differential
system with boundary conditions:

\[
\begin{cases}
F(T) + mTF'(T) = F(T)^{a+1}\, G(T)^{b} \\
G(T) + mTG'(T) = F(T)^{c}\, G(T)^{d+1} \\
F(0) = G(0) = 1 \\
F'(0) = \dfrac{b}{S} \ \text{and} \ G'(0) = -\dfrac{c}{S}.
\end{cases}
\qquad (53)
\]

The fact that the urn is large implies that Equations (53) characterize the moments
of $X$ and $Y$. Indeed, proceed by recursion: for any $n \ge 2$, $v_n = (a_n, b_n)$ is
the solution of a linear system of the form $(R - nmI)(v_n) = [\text{polynomial function
of } v_1, \dots, v_{n-1}]$, $R$ being the replacement matrix of the process. Since the urn
is large, $nm > nS/2 \ge S$ is not an eigenvalue of $R$.
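The recursion can be made concrete by solving the system coefficient by coefficient. The sketch below is one possible way to organize it, under assumptions: it works with the normalized coefficients $a_n/n!$ and $b_n/n!$, extracts the $T^n$ coefficient of $F^{a+1}G^b$ and $F^cG^{d+1}$ with the unknown order-$n$ coefficients set to zero, and solves the resulting $2 \times 2$ linear system, whose determinant is $(nm - S)(nm - m)$. The function names and the example urn $(4, 1, 1, 4)$ are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q, order):
    """Product of two polynomials (coefficient lists), truncated at degree `order`."""
    out = [Fraction(0)] * (order + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= order:
                out[i + j] += pi * qj
    return out

def poly_pow(p, k, order):
    """k-th power of a polynomial, truncated at degree `order`."""
    out = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(k):
        out = poly_mul(out, p, order)
    return out

def moments(a, b, c, d, N):
    """Moments (a_n) of X and (b_n) of Y for n = 0..N, from the Laplace-series system."""
    S, m = a + b, a - c
    f = [Fraction(1), Fraction(b, S)]    # f[n] = a_n / n!
    g = [Fraction(1), Fraction(-c, S)]   # g[n] = b_n / n!
    for n in range(2, N + 1):
        f.append(Fraction(0))            # placeholders: unknown order-n coefficients
        g.append(Fraction(0))
        P = poly_mul(poly_pow(f, a + 1, n), poly_pow(g, b, n), n)[n]
        Q = poly_mul(poly_pow(f, c, n), poly_pow(g, d + 1, n), n)[n]
        # Linear system: (n m - a) f_n - b g_n = P  and  -c f_n + (n m - d) g_n = Q.
        det = (n * m - a) * (n * m - d) - b * c
        f[n] = ((n * m - d) * P + b * Q) / det
        g[n] = (c * P + (n * m - a) * Q) / det
    return ([f[n] * factorial(n) for n in range(N + 1)],
            [g[n] * factorial(n) for n in range(N + 1)])

xs, ys = moments(4, 1, 1, 4, 4)
print(xs, ys)
```

For the symmetric urn used here the second moments of $X$ and $Y$ coincide, while the odd moments have opposite signs.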

A remarkable fact, which explains why we have worked with characteristic
functions and not with Laplace transforms, is that, for non triangular urns, i.e.
when $bc \neq 0$, the series $F$ and $G$ have a radius of convergence equal to 0
(Corollary 6.25).



8.3 Question 

The main theorem provides a family of distributions, those of the $W$'s, indexed
by the three parameters $S$, $m$, $b$ of the urn and by the initial condition $(\alpha, \beta)$. A
challenging question is: can the physical relations between these distributions
be translated into relations between the Abelian integrals? In other words, can
the addition formulas between Abelian integrals be interpreted by a
combinatorial/probabilistic approach using these distributions?